Cisco UCS Non Interactive Diagnostic Tool

Document created by grewilki on Feb 26, 2016. Last modified by grewilki on Nov 3, 2016.
Version 2

 

You can use diagnostic tools to diagnose hardware problems with your Cisco servers. The user interface displays the status of the test run and lets you examine log files to troubleshoot hardware issues. The UCS Server Configuration Utility (SCU) is an application that provides a graphical user interface for running snapshot diagnostic tests on a standalone Cisco C-Series server.

 

With the NI-IOD Python script, you can run the same snapshot diagnostic tests on multiple C-Series servers. The script can be run from any Linux host with a version of Python that supports multiprocessing.

The snapshot test results are copied to the configured remote host.
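
Before copying the script to a Linux host, it can help to confirm that the local interpreter provides the multiprocessing module (part of the Python standard library since Python 2.6). The following is a minimal sketch and is not part of the NI-IOD tool; the function name has_multiprocessing is hypothetical.

```python
import sys


def has_multiprocessing():
    """Return True if this interpreter can import the multiprocessing module."""
    try:
        import multiprocessing  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Report the interpreter version together with the check result.
    version = ".".join(str(part) for part in sys.version_info[:3])
    status = "available" if has_multiprocessing() else "missing"
    print("Python %s: multiprocessing is %s" % (version, status))
```

Run it with the same interpreter that you intend to use for the diagnostic script.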

 

Download SCU ISO from cisco.com

 

The Server Configuration Utility (SCU) ISO is available for download at cisco.com.

download1.png

 

Download the latest version of the ISO. Although SCU does not depend on the server model, select UCS C240 M4 and download SCU from that page.

download2.png

 

Prepare the tool

 

After downloading the ISO, save it to a shared location. This can be a CIFS, NFS, or HTTP share.

 

 

Copy the diagnostic tool to the Linux server from which you want to run the script.

 

 

How to run the tool

 

Next, edit the multiserver_config file with the address information of each server on which you want to run the diagnostic tool. See Appendix A for a sample multiserver_config file. Once the file is ready, run the tool with the command shown below. The "-l" option is not required, but it is useful if you want to log results.

run_tool.png

How to check if the tool is running properly

 

Once the diagnostic tool starts, it boots the server from the diagnostic ISO. If you log in to the KVM console of the server, you will see the screen below.

scu.png

 

Once the server has booted from the SCU ISO, the screen below appears. Do not click anything; the diagnostic tool is running in the background.

scu2.png

You might see something like the screen below while the test is running. Again, do not click anything. You may even close the KVM console.

scu3.png

How to check the Results

 

Once the test finishes, the script reboots the server, and a log file is created on the remote host specified in your configuration. The log file name is a combination of the server model, serial number, date, and time.

results.png
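
Because the serial number is part of each log file name, the logs for one server can be picked out of a results directory with a few lines of Python. This is a rough sketch that assumes the serial number appears verbatim in the file name; the exact naming pattern is not specified here, and the function name find_result_logs is hypothetical.

```python
import os


def find_result_logs(directory, serial_number):
    """Return sorted paths of files in `directory` whose names contain the serial number."""
    matches = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Assumption: the serial number is embedded as-is in the file name.
        if os.path.isfile(path) and serial_number in name:
            matches.append(path)
    return sorted(matches)
```

For example, find_result_logs("/path/to/results", "FCH1827V19W") would return every file in that (hypothetical) directory whose name contains the serial number of the server shown in Appendix B.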

Appendix B shows a sample results file.
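
When results from many servers have been collected, scanning each file for failed quick tests by hand becomes tedious. The sketch below is one possible way to do this in Python, based only on the sample in Appendix B: it assumes each test entry starts with its number and that the status (PASSED, FAILED, or N/A) appears either at the end of the same line or on a following line by itself. The function names parse_quick_tests and failed_tests are hypothetical and are not part of the diagnostic tool.

```python
import re

STATUSES = ("PASSED", "FAILED", "N/A")

# A numbered entry such as "10. Interface1 (linktest)", optionally followed
# by its status at the end of the same line.
ENTRY_RE = re.compile(r"^(?:\d+\.\s*)+(.*?)(?:\s+(PASSED|FAILED|N/A))?$")


def parse_quick_tests(text):
    """Map each quick-test name to PASSED, FAILED, or N/A."""
    results = {}
    pending = None  # test name still waiting for its status line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines may separate a name from its status
        if line in STATUSES:
            if pending is not None:
                results[pending] = line
            pending = None
            continue
        match = ENTRY_RE.match(line)
        if match is None:
            pending = None  # headings, rules, and other text are ignored
            continue
        name, status = match.group(1), match.group(2)
        if status is not None:
            results[name] = status
            pending = None
        else:
            pending = name
    return results


def failed_tests(results):
    """Return the names of all tests whose status is FAILED, sorted by name."""
    return sorted(name for name, status in results.items() if status == "FAILED")
```

Applied to the sample in Appendix B, failed_tests(parse_quick_tests(text)) would report the two Interface link tests and the CIMC Self Test.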

 

Appendix A

 

Sample Config file (multiserver_config)

Screen Shot 2016-02-25 at 5.47.34 PM.png

Appendix B

Start of snapshot data. Time: Wed Feb 24 19:25:08 EST 2016

 

-----------------------------------------------------------------------

Server Inventory

-----------------------------------------------------------------------

Chassis/Baseboard summary:

-----------------------------------------------------------------------

-----------------------------------------------------------------------

Manufacturer: Cisco Systems

Product ID: UCSC-C220-M4L

Version: 74-12419-01

Serial Number: FCH1827V19W

Server UUID: 53B7B1C6-7B47-4939-9080-125D01AEBF22

Number Of PSU: 2

Number Of Processors: 2

Total Memory: 32768 MB

BIOS version: C220M4.2.0.9b.0

BIOS vendor: Cisco Systems, Inc.

Number of PCI adapters: 2

 

CIMC summary:

-----------------------------------------------------------------------

-----------------------------------------------------------------------

Firmware Version: 2.0(10.126)

Mac Address: f4:0f:1b:1e:b2:5c

IP Address: 10.29.131.118

 

Processor summary:

-----------------------------------------------------------------------

-----------------------------------------------------------------------

Number Of Processors: 2

Processor Socket: 1 (Core count: 18, Core enabled: 18), Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz

Processor Socket: 2 (Core count: 18, Core enabled: 18), Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz

 

Memory summary:

-----------------------------------------------------------------------

-----------------------------------------------------------------------

Total Memory: 32768 MB

DIMM populated: 4

 

DIMM 1: Locator: NODE 0 CHANNEL 0 DIMM 0, Size: 8192 MB, PartNumber: M393A1G40DB0-CPB, Manufacturer: 0xCE00, SerialNumber: 01E90931

DIMM 2: Locator: NODE 0 CHANNEL 1 DIMM 0, Size: 8192 MB, PartNumber: M393A1G40DB0-CPB, Manufacturer: 0xCE00, SerialNumber: 01E9084F

DIMM 3: Locator: NODE 1 CHANNEL 0 DIMM 0, Size: 8192 MB, PartNumber: M393A1G40DB0-CPB, Manufacturer: 0xCE00, SerialNumber: 01E90859

DIMM 4: Locator: NODE 1 CHANNEL 1 DIMM 0, Size: 8192 MB, PartNumber: M393A1G40DB0-CPB, Manufacturer: 0xCE00, SerialNumber: 01E9085C

 

Storage summary:

-----------------------------------------------------------------------

-----------------------------------------------------------------------

Controller 1: MegaRAID SAS-3 3108 [Invader], Type: RAID bus controller

-------------

Vendor: LSI Logic / Symbios Logic

 

Disk 1: Description: ST1000NM0001, Vendor: SEAGATE, Size: 931.512 GB, Type: SAS

Disk 2: Description: ST1000NM0001, Vendor: SEAGATE, Size: 931.512 GB, Type: SAS

 

Controller 2: RAID, Type: Mass storage device

-------------

Vendor: Cypress

Disk 1: Description: trymirror, Vendor: CiscoVD, Size: 29.719238 GB, Type: SCSI Disk

Controller 3: Cisco USB Composite Device-1, Type: Mass storage device

-------------

Vendor: Avocent

 

Disk 1: Description: , Vendor: , Size: , Type: DVD reader

Disk 2: Description: , Vendor: , Size: , Type: SCSI Disk

Disk 3: Description: , Vendor: , Size: , Type: SCSI Disk

Disk 4: Description: , Vendor: , Size: , Type: DVD reader

Disk 5: Description: , Vendor: , Size: , Type: SCSI Disk

 

Controller 4: Intel Corporation, Type: SATA controller

-------------

Vendor: Intel Corporation

 

 

 

PCI Adapter summary:

-----------------------------------------------------------------------

-----------------------------------------------------------------------

Total adapters: 2

Adapter 1: SlotID:HBA, Card: RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 [Invader] (rev 02)

 

Power supply summary:

-----------------------------------------------------------------------

-----------------------------------------------------------------------

Total power supply: 2

PSU 1: Name: PSU1, Manufacturer: Cisco Systems Inc, Part Number: 341-0591-01, Serial: LIT182008HP

PSU 2: Name: PSU2, Manufacturer: NA, Part Number: NA, Serial: NA

 

----------------------------------------------------------------------

Server Inventory SerialNo-PID Data

-----------------------------------------------------------------------

000000000c2,DIMM_A1   0xCE00,FCH1827V19W,UCSC-C220-M4L

000000000c2,DIMM_B1 0xCE00,FCH1827V19W,UCSC-C220-M4L

000000000c2,DIMM_E1   0xCE00,FCH1827V19W,UCSC-C220-M4L

000000000c2,DIMM_F1   0xCE00,FCH1827V19W,UCSC-C220-M4L

0002Z1N49NLQ,SEAGATE SAS DISK,FCH1827V19W,UCSC-C220-M4L

0002Z1N4BPJ1,SEAGATE SAS DISK,FCH1827V19W,UCSC-C220-M4L

012345678900,CiscoVD SCSI Disk DISK,FCH1827V19W,UCSC-C220-M4L

NA, DVD reader DISK,FCH1827V19W,UCSC-C220-M4L

NA, SCSI Disk DISK,FCH1827V19W,UCSC-C220-M4L

NA, SCSI Disk DISK,FCH1827V19W,UCSC-C220-M4L

NA, DVD reader DISK,FCH1827V19W,UCSC-C220-M4L

NA, SCSI Disk DISK,FCH1827V19W,UCSC-C220-M4L

LIT182008HP,UCSC-PSU1-770W,FCH1827V19W,UCSC-C220-M4L

NA,NA,FCH1827V19W,UCSC-C220-M4L

xxxxxx,Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz,FCH1827V19W,UCSC-C220-M4L

xxxxxx,Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz,FCH1827V19W,UCSC-C220-M4L

xxxxxx,PCI  ADAPTER,FCH1827V19W,UCSC-C220-M4L

 

 

-----------------------------------------------------------------------

Server diagnostics Quick test results:

-----------------------------------------------------------------------

 

Test name                                 result
---------                                 ------
1. Processor Test                         PASSED
2. Cache Validation Test                  PASSED
3. Memory Test                            PASSED
4. VDisk4 (quick)                         PASSED
5. PDisk1 (quick)                         PASSED
6. PDisk0 (quick)                         PASSED
7. Video Memory Test                      PASSED
8. Interface1 (selftest)                  PASSED
9. Interface0 (selftest)                  PASSED
10. Interface1 (linktest)                 FAILED
11. Interface0 (linktest)                 FAILED
12. QPI Link Test                         PASSED
13. QPI Traffic Test                      PASSED
14. CIMC Self Test                        FAILED
15. LSI backup battery Test (ctrlId 0)    N/A
16. LSI RAID Adapter Test (ctrlId 0)      PASSED
17. Chipset Test                          PASSED
18. Enclosure Test                        N/A

 

 

-----------------------------------------------------------------------

 

-----------------------------------------------------------------------

Server Probe data

-----------------------------------------------------------------------

 

Faults:

------

  1. (CIMC) SEL fullness percentage: (100.00), sensor reading is critically high

 

Warnings:

---------

  1. (PSU) PSU2 is not installed in the slot

-----------------------------------------------------------------------

 

-----------------------------------------------------------------------

End of snapshot. Time: Wed Feb 24 19:25:09 EST 2016

 

-----------------------------------------------------------------------
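
The entries listed under Faults: and Warnings: in the Server Probe data section of a results file can also be collected with a short script. The following is a minimal sketch based only on the sample above: it assumes each heading is followed by a dashed underline and that the next dashed rule ends the list. The function name parse_probe_data is hypothetical and is not part of the diagnostic tool.

```python
import re

# Leading list numbering such as "1. " in front of an entry.
NUMBERING_RE = re.compile(r"^\d+\.\s*")


def parse_probe_data(text):
    """Collect the entries listed under the 'Faults:' and 'Warnings:' headings."""
    found = {"Faults": [], "Warnings": []}
    current = None      # heading whose entries are being collected
    underlined = False  # True once the dashed underline of that heading was seen
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line in ("Faults:", "Warnings:"):
            current = line[:-1]
            underlined = False
        elif set(line) == {"-"}:
            if current is not None and not underlined:
                underlined = True  # underline directly below the heading
            else:
                current = None     # any later dashed rule closes the list
        elif current is not None:
            found[current].append(NUMBERING_RE.sub("", line))
    return found
```

For the sample above, this would yield one fault (the CIMC SEL fullness reading) and one warning (PSU2 not installed in the slot).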
