Rebooting Drives
I was speaking to a drive manufacturer the other day who again stated that when they get supposedly failed hard drives back from storage system vendors, a high percentage of them (80 percent as I recall) work just fine. All they needed was a power cycle and they came back to life. Of course, the remedy for hard drive failure is the same remedy we use for our desktops when something goes wrong. Reboot.
December 4, 2009
What's the big deal to you? If you have a drive failure you can simply replace the drive with a new one. It's the manufacturer's problem, right? If you have a spare drive sitting on the shelf ready to replace a failed drive then yes, but many data centers don't have extra drives. If you don't have a spare drive, you have to let your hot spare take over, let the RAID rebuild happen, order a new drive, pack up the old one and send it back. This all takes time, and unless you can get someone else to do it for you, that's time you probably don't have. If the drive could just be rebooted and returned to operation, even if that meant you still had to go through the RAID rebuild, at least you wouldn't have to go through an RMA process to send back a drive that is likely not really bad.
Where this gets interesting is if the drive can be rebooted before forcing a RAID rebuild. For example, let's say one of the drives in your RAID-5 group fails. With XOR parity calculations, the group can obviously continue to operate. If the system rebooted the drive before going to the global spare and initiating a RAID rebuild, you could avoid that lengthy rebuild process altogether. This would take some intelligence on the part of the storage system to maintain data availability while the drive reboots, but some suppliers are working on providing this capability.
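To make the RAID-5 point concrete, here is a minimal Python sketch of the two ideas in that paragraph: serving reads from XOR parity while one drive is missing, and trying a power cycle before falling back to the spare. It is an illustration under stated assumptions, not any vendor's implementation. The names `xor_blocks`, `read_stripe`, and `handle_drive_fault` are made up for this example, and `power_cycle` and `start_rebuild` are placeholder callables standing in for whatever a real storage system would do.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def read_stripe(members, parity, missing):
    """Return all data blocks of one stripe while one member is offline.

    The block at index `missing` is recomputed as the XOR of the surviving
    data blocks and the parity block, which is how a RAID-5 group keeps
    operating with a single drive out of service.
    """
    survivors = [blk for i, blk in enumerate(members) if i != missing]
    rebuilt = xor_blocks(survivors + [parity])
    return [rebuilt if i == missing else blk for i, blk in enumerate(members)]


def handle_drive_fault(power_cycle, start_rebuild):
    """Hypothetical policy: reboot the drive first, rebuild only if that fails.

    `power_cycle` is assumed to return True when the drive comes back healthy.
    `start_rebuild` is assumed to kick off a rebuild onto the global spare.
    """
    if power_cycle():
        return "returned to service"
    start_rebuild()
    return "rebuilding to spare"
```

As a usage example, take a stripe of three data blocks, compute `parity = xor_blocks(data)`, then call `read_stripe([data[0], None, data[2]], parity, missing=1)`: the result equals the original `data`, even though the second drive's block was never read. The sketch ignores one real complication: any writes that arrive while the drive is offline would still have to be brought up to date on it before it rejoins the group.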
I'm sure there is some sort of green angle here to make my 'save the planet' friends happy, too. Think of all the carbon we would save by not shipping drives that aren't bad all over the country. There is also the waste involved in manufacturing extra drives that never needed to be made. I'm not sure what each manufacturer does with the failed drives, but I'm sure they can't resell them as new.
We all know that IT has to do more with less, and there are elaborate presentations from vendors on how their products do that. Taking the simple reboot concept and implementing it in intelligent storage systems could go a long way toward increasing productivity and performance.