Difference between revisions of "Replacing failed disk on Cluster 0"

From DISI

Latest revision as of 10:42, 11 September 2019

How to check if a disk has failed

Check the status light on the disk:

Solid Yellow => Fail

Blinking Yellow => Predictive Failure (going to fail soon)

Green => Normal

Disk replacement instructions

  • Determine which machine the disk belongs to.
  • Press the red button on the disk to turn it off.
  • Gently pull the disk out a little (NOT all the way) and wait about 10 seconds for it to stop spinning before pulling it all the way out.
  • Find a replacement disk with the same specs.
  • Carefully unscrew the failed disk from its disk holder so the replacement can go into it (if the replacement already comes with the same disk holder, you can skip this).

How to check if a disk has failed or is installed correctly

1. Log in to gimel (sgehead1.bkslab.org) as root

$ ssh root@sgehead1.bkslab.org

2. Log in as root to the machine you identified earlier

$ ssh root@<machine_name>
Example: RAID 3, 6, and 7 belong to nfshead2

3. Run this command

$ /opt/compaq/hpacucli/bld/hpacucli ctrl all show config
Output Example:
Smart Array P800 in Slot 1                (sn: PAFGF0N9SXQ0MX)
  array A (SATA, Unused Space: 0 MB)
     logicaldrive 1 (5.5 TB, RAID 1+0, OK)
     physicaldrive 1E:1:1 (port 1E:box 1:bay 1, SATA, 1 TB, OK)
     physicaldrive 1E:1:2 (port 1E:box 1:bay 2, SATA, 1 TB, OK)
     physicaldrive 1E:1:3 (port 1E:box 1:bay 3, SATA, 1 TB, OK)
     physicaldrive 1E:1:4 (port 1E:box 1:bay 4, SATA, 1 TB, OK)
     physicaldrive 1E:1:5 (port 1E:box 1:bay 5, SATA, 1 TB, OK)
     physicaldrive 1E:1:6 (port 1E:box 1:bay 6, SATA, 1 TB, OK)
     physicaldrive 1E:1:7 (port 1E:box 1:bay 7, SATA, 1 TB, OK)
     physicaldrive 1E:1:8 (port 1E:box 1:bay 8, SATA, 1 TB, OK)
     physicaldrive 1E:1:9 (port 1E:box 1:bay 9, SATA, 1 TB, OK)
     physicaldrive 1E:1:10 (port 1E:box 1:bay 10, SATA, 1 TB, OK)
     physicaldrive 1E:1:11 (port 1E:box 1:bay 11, SATA, 1 TB, OK)
     physicaldrive 1E:1:12 (port 1E:box 1:bay 12, SATA, 1 TB, OK)
  array B (SATA, Unused Space: 0 MB)
     logicaldrive 2 (5.5 TB, RAID 1+0, OK)
     physicaldrive 2E:1:1 (port 2E:box 1:bay 1, SATA, 1 TB, OK)
     physicaldrive 2E:1:2 (port 2E:box 1:bay 2, SATA, 1 TB, Predictive Failure)
     physicaldrive 2E:1:3 (port 2E:box 1:bay 3, SATA, 1 TB, OK)
     physicaldrive 2E:1:4 (port 2E:box 1:bay 4, SATA, 1 TB, OK)
     physicaldrive 2E:1:5 (port 2E:box 1:bay 5, SATA, 1 TB, OK)
     physicaldrive 2E:1:6 (port 2E:box 1:bay 6, SATA, 1 TB, OK)
     physicaldrive 2E:1:7 (port 2E:box 1:bay 7, SATA, 1 TB, OK)
     physicaldrive 2E:1:8 (port 2E:box 1:bay 8, SATA, 1 TB, OK)
     physicaldrive 2E:1:9 (port 2E:box 1:bay 9, SATA, 1 TB, OK)
     physicaldrive 2E:1:10 (port 2E:box 1:bay 10, SATA, 1 TB, OK)
     physicaldrive 2E:1:11 (port 2E:box 1:bay 11, SATA, 1 TB, OK)
     physicaldrive 2E:1:12 (port 2E:box 1:bay 12, SATA, 1 TB, OK)
  array C (SATA, Unused Space: 0 MB)
     logicaldrive 3 (5.5 TB, RAID 1+0, Ready for Rebuild)
     physicaldrive 2E:2:1 (port 2E:box 2:bay 1, SATA, 1 TB, OK)
     physicaldrive 2E:2:2 (port 2E:box 2:bay 2, SATA, 1 TB, OK)
     physicaldrive 2E:2:3 (port 2E:box 2:bay 3, SATA, 1 TB, OK)
     physicaldrive 2E:2:4 (port 2E:box 2:bay 4, SATA, 1 TB, OK)
     physicaldrive 2E:2:5 (port 2E:box 2:bay 5, SATA, 1 TB, OK)
     physicaldrive 2E:2:6 (port 2E:box 2:bay 6, SATA, 1 TB, OK)
     physicaldrive 2E:2:7 (port 2E:box 2:bay 7, SATA, 1 TB, OK)
     physicaldrive 2E:2:8 (port 2E:box 2:bay 8, SATA, 1 TB, OK)
     physicaldrive 2E:2:9 (port 2E:box 2:bay 9, SATA, 1 TB, OK)
     physicaldrive 2E:2:10 (port 2E:box 2:bay 10, SATA, 1 TB, OK)
     physicaldrive 2E:2:11 (port 2E:box 2:bay 11, SATA, 1 TB, OK)
     physicaldrive 2E:2:12 (port 2E:box 2:bay 12, SATA, 1 TB, OK)
  Expander 243 (WWID: 50014380031A4B00, Port: 1E, Box: 1)
  Expander 245 (WWID: 5001438005396E00, Port: 2E, Box: 2)
  Expander 246 (WWID: 500143800460A600, Port: 2E, Box: 1)
  Expander 248 (WWID: 50014380055E913F)
  Enclosure SEP (Vendor ID HP, Model MSA60) 241 (WWID: 50014380031A4B25, Port: 1E, Box: 1)
  Enclosure SEP (Vendor ID HP, Model MSA60) 242 (WWID: 5001438005396E25, Port: 2E, Box: 2)
  Enclosure SEP (Vendor ID HP, Model MSA60) 244 (WWID: 500143800460A625, Port: 2E, Box: 1)
  SEP (Vendor ID HP, Model P800) 247 (WWID: 50014380055E913E)
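
With three arrays of twelve disks each, the one line that is not OK is easy to miss. The sketch below filters the output of step 3 down to the logicaldrive and physicaldrive lines whose status is anything other than OK. The function name find_unhealthy is hypothetical and not part of hpacucli. It assumes the status is always the last field before the closing parenthesis, as in the example output above.

```shell
#!/bin/sh
# find_unhealthy (hypothetical helper name): reads the output of
# "hpacucli ctrl all show config" on stdin and prints only the
# logicaldrive/physicaldrive lines whose status is not "OK".
# Assumption: the status is the last field before the closing ")".
find_unhealthy() {
    grep -E '^[[:space:]]*(physicaldrive|logicaldrive) ' \
        | grep -Ev ', OK\)[[:space:]]*$' \
        | sed 's/^[[:space:]]*//'
}

# Demo on three lines copied from the example output above.
find_unhealthy <<'EOF'
     logicaldrive 3 (5.5 TB, RAID 1+0, Ready for Rebuild)
     physicaldrive 2E:1:1 (port 2E:box 1:bay 1, SATA, 1 TB, OK)
     physicaldrive 2E:1:2 (port 2E:box 1:bay 2, SATA, 1 TB, Predictive Failure)
EOF

# On the storage machine itself, the same filter would be used as:
#   /opt/compaq/hpacucli/bld/hpacucli ctrl all show config | find_unhealthy
```

The demo prints the "Ready for Rebuild" logical drive and the "Predictive Failure" physical drive and drops the OK line. An empty result means every drive reported OK, which is what you should see after a correctly installed replacement has finished rebuilding.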