Question RAID NAS - Do you keep a spare drive in case of failure?

Monster900

I have recently been persuaded here that, once you have a NAS arrangement that includes RAID resilience to disk failure (with backup), it makes sense to run disks to the point where they fail rather than swap them out routinely after four years' run time. Do people here keep a spare disk on the shelf, ready to go, or do you just order a new disk when one fails and run the system in a degraded state until it arrives?

As always, thanks for your thoughts.

M
 
Just order a new disk when required. The price of HDDs is constantly dropping. 1TB used to be the best price per GB, then 2TB took over, then 3TB, and now 4TB is the best price per GB. All of that has happened since 2008, so in 2 years 6TB will probably be the best price per GB.
 
In commercial environments we often create RAID arrays with a "hot spare" in the enclosure, so that as soon as the controller detects a failed HDD it automatically shuts it down and starts recovering onto the hot spare. In larger deployments like SANs with dozens or hundreds of discs in them, we might have a few hot spares spun up and ready to go at a moment's notice, usually automatically. But that's extra cost in HDDs and energy that isn't doing anything useful.

In a domestic install, you might argue it's not worth the expense of an extra HDD either spun up as a hot spare or sitting on a shelf as a cold spare.

I submit it's a value judgement based on how important your data is and how much impact would be felt by losing access to it. For example, in a business it's argued we need data available and up to date constantly. Think about your bank balance: who'd tolerate loss of access to that for a few minutes, let alone days? Conversely, if I lost access to my music and videos for a few days whilst I waited for Amazon to overnight a replacement disc to me, it's hardly the end of the world.

EDIT - I guess a point that should also be made is that a RAID array in a degraded state is very much "at risk": should a second failure occur before the replacement has been deployed and recovery completed, you'll lose everything and need to recover from backup. Thus, as we often assert in this forum, RAID is not a backup; one should still make backups, at least of anything irreplaceable and/or important and/or that needs to be "up to date."
 
I keep an "on the shelf" cold spare (I'm running a 4 bay Synology NAS with the disks in SHR). Everything is backed up, but when there's 30+ TB of stuff, that takes some while to reload. I'd rather have the cold swap disk ready to drop in and start a rebuild. I use shucked USB desktop drives, and by buying the 5 drives when prices were at a low point the extra cost isn't excessive.
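For a rough sense of why reloading 30+ TB from backup "takes some while", here is a back-of-envelope sketch. The 100 MB/s sustained throughput is purely an assumption (roughly what a gigabit link can manage), not a figure from this thread, and `restore_hours` is just an illustrative name.

```python
def restore_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to copy data_tb terabytes at a sustained throughput_mb_s."""
    total_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return total_mb / throughput_mb_s / 3600  # seconds -> hours


# 30 TB at an assumed 100 MB/s sustained rate
hours = restore_hours(30, 100)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")
```

Under those assumptions a full reload runs to several days, which is the case for dropping in a spare and rebuilding instead. Real figures depend on the backup medium and the network.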
 
Actual disk failures are normally few and far between, barring a "bad batch" or whatever you care to call it. With that in mind, there will most likely be 2 scenarios:
1) the cost of the drive you have sitting on the shelf will be higher now than when you have a failure, or
2) for the same or probably lesser amount, you can get a better, larger drive, assuming your NAS supports it. You could then consider a one-at-a-time disk replacement, and expand your storage.

Option 1 saves you money, however you do have to wait for the disk to arrive.
Option 2 allows you to increase capacity. You don't need to do it right away, but it's something to think about.
 
I don't see the need to keep a spare drive; you can always order one for next day delivery when you need one.

A RAID array only protects against individual drive failures. It won't protect against the damaging effects of, say, a power surge or ransomware.

If your data is valuable you should have additional backups to guard against these things, and so running the RAID array in a degraded state for a day or two really doesn't matter because you're not relying on it anyway.

I run two RAID servers, one of which backs itself up to the other overnight. It also backs up to a USB drive; I have three of these and rotate them regularly so I always have one which is bang up to date, one in a different physical location, and one which is old enough to (probably!) pre-date any data corruption that I may not have noticed as soon as it happened.
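A three-drive rotation like the one described can be sketched as below. The drive labels and the idea of a fixed "cycle" (a week, say) are assumptions for illustration only; the post does not say how often the drives are swapped.

```python
DRIVES = ["USB-A", "USB-B", "USB-C"]  # hypothetical labels for the three drives


def roles_for_cycle(cycle: int) -> dict:
    """Return which drive plays which role in a given rotation cycle."""
    n = len(DRIVES)
    return {
        "current": DRIVES[cycle % n],        # receives this cycle's backup
        "offsite": DRIVES[(cycle - 1) % n],  # previous copy, kept at another location
        "oldest": DRIVES[(cycle - 2) % n],   # two cycles old, may pre-date unnoticed corruption
    }


for cycle in range(3):
    print(cycle, roles_for_cycle(cycle))
```

Each drive moves through all three roles in turn, so at any moment there is one fresh copy, one off-site copy and one older copy to fall back on.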
 
I have 3 x 8 HDD RAID arrays in RAID6. No hot spares or cold spares. To lose data, I would need 3 disc failures in the same array. That would be seriously unlucky, but everything is backed up anyway.
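The "3 failures to lose data" figure follows from RAID6 keeping two drives' worth of parity. A minimal sketch of that arithmetic is below; the 4 TB drive size is an assumption, since the post doesn't give one, and `array_summary` is a made-up helper name.

```python
# Parity overhead and minimum drive count for the standard single- and double-parity levels
PARITY_DRIVES = {"RAID5": 1, "RAID6": 2}
MIN_DRIVES = {"RAID5": 3, "RAID6": 4}


def array_summary(level: str, drives: int, drive_tb: float) -> dict:
    """Usable capacity and failure tolerance for one array of identical drives."""
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"{level} needs at least {MIN_DRIVES[level]} drives")
    parity = PARITY_DRIVES[level]
    return {
        "usable_tb": (drives - parity) * drive_tb,  # capacity left after parity
        "survives": parity,                         # failures tolerated without data loss
        "data_loss_at": parity + 1,                 # the failure that loses the array
    }


# One 8-drive RAID6 array, assuming 4 TB drives
print(array_summary("RAID6", 8, 4))
```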
 
Thanks for all the responses to this. They have been really helpful.

On balance, I have decided not to have a 'cold' spare drive on the shelf, as the system should be sufficiently resilient to tolerate a wait of a day or two for a replacement to be delivered. The only exception may be if a real bargain turns up; then I may be tempted to buy it.

Thanks again.
 
Just thought I'd add my two cents. I have a hot spare now as my chassis is large enough to support it, but I didn't used to when I had a slightly smaller chassis. Back then I didn't have a cold spare either for the same reason as you.

The array ran without problems for about 2 years, but a drive was predicted to fail, so of course I replaced it with another drive of the same size, make and model but a different release version (WD Red 3TB). It caused me no end of problems: the array failed repeatedly, blah blah blah.

When I got on to LSI Support, they explained that the new drive, although the same size, had a different cylinder/sector layout from the others, and that this would always cause problems: the array would fail again and keep randomly failing (as it had been) if I kept this drive in place.

Long story short, I sent this drive back to ebuyer (they maintained this explanation was rubbish and that it would make no difference, but they did refund me). After some searching I managed to find a drive the same as the originals on Amazon, rebuilt the array and have never had a problem since. (It now sits in a different server that I use for VMware storage, but it is still going strong after, well... it has to be 8-10 years now.)

Anyhoo, LSI did say that this is a rare occurrence that strictly speaking shouldn't happen, but it did. Of course new RAID cards are far better and I don't imagine I'd get this problem with a new one, but since then I've always bought an extra drive just to be sure I get the same model as all the rest.
 
There is an argument that a RAID array not only can, but should, be built with drives of different types.

The reason is to defend against design flaws. If one make/model of drive has a defect which causes it to fail prematurely, chances are the other drives in the array won't share the same defect and won't fail at the same time.

It's not a hypothetical problem. Here's an example, where a firmware bug causes a drive to predictably fail at 40,000 hours. This, of course, would happen to every drive in the array at exactly the same time:

 
I've been running a ReadyNas NV2+ for several years, 4x2TB disks. I don't keep a spare and if one went you could get one delivered next day.
 
I had an instance a few years ago where one of the WD Reds populating my original NAS failed.

At the time I had also decided not to bother keeping a hot or cold spare at home, after roughly calculating the odds of ever needing an urgent replacement vs the additional expense. (After all, the chances of a simultaneous double failure, or of two drives failing within, say, 48 hours, are pretty remote, though obviously not impossible.)
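That kind of rough odds calculation might look like the sketch below. The 2% annualised failure rate (AFR), the three surviving drives and the assumption that drives fail independently at a constant rate are all illustrative assumptions, not figures from this thread. Drives from the same batch, or the stress of a rebuild, can make real-world odds worse.

```python
import math


def p_another_failure(remaining_drives: int, afr: float, window_hours: float) -> float:
    """Probability that at least one remaining drive fails within the window,
    assuming independent failures at a constant rate matching the given AFR."""
    hours_per_year = 24 * 365
    rate_per_hour = -math.log(1 - afr) / hours_per_year  # constant hazard rate
    p_one_survives = math.exp(-rate_per_hour * window_hours)
    return 1 - p_one_survives ** remaining_drives


# 4-bay NAS with one drive already dead: 3 drives left, assumed 2% AFR
for label, hours in [("48 hours", 48), ("two weeks", 14 * 24)]:
    print(f"{label}: {p_another_failure(3, 0.02, hours):.3%}")
```

Under these assumptions the risk over a 48-hour wait is a small fraction of a percent, but it grows roughly in proportion to how long the array stays degraded.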

As with many others, I decided to simply order any replacements I might need on an as-required basis. However, when my priority next-day replacement arrived it was defective (which then required a ridiculously lengthy RMA process under the circumstances). While that RMA was still going through I paid out for a second next-day replacement & that too was faulty on arrival!! (I order important components like HDDs from a well known UK retailer & I certainly wouldn't Amazon Prime such components, or you're literally asking for trouble with the way Amazon package things or their delivery agents throw things around ;).) Eventually, a third drive arrived which functioned normally & I was able to successfully rebuild the volume, thankfully with no harm done at all. (I was also eventually reimbursed for the two earlier faulty replacements, but that took nearly two weeks to go through as well & I had to essentially pay out for three drives in three days then wait to be reimbursed.)

Lesson learned!!! If a second drive had failed outright, or the volume had degraded on a second drive, in the time I was messing around waiting for a functioning replacement, it would have been a whole other story. (Obviously, I back up any vital/important data, which all fits on to external HDDs, but backing up TBs of multimedia is another matter entirely & that level of redundancy is ultimately very expensive for a typical home user.)

Ever since that incident I've kept a cold spare just in case & I picked up said spare at nearly half the normal price on a snap deal anyway, so it's definitely worth having IMHO. I've also since acquired additional spares after upgrading another NAS. (It's the completely random & unforeseen problems that invariably end up biting you squarely on the backside with this type of scenario ;).) I recently had another older drive fail in another NAS & thankfully I was able to put a spare straight in to rebuild the volume immediately without several days of sweating ;).

Granted, it's not cheap to just buy a spare to sit in a drawer/on a shelf that ultimately may never be required anyway, but how much is all the data that you can't/don't back up from the NAS, or have additional redundancy for, ultimately worth to you should the "improbable" actually happen?

RAID/SHR gives you a degree of redundancy & essentially buys you time if you act swiftly, but, it's those "what ifs" that'll get ya ;).

Entirely subjective which way you go on this. It's the added peace of mind a spare gives you should something you just can't control suddenly crop up really :).

Enjoy.
 
Your scenario, I can only assume, was because you were using RAID. If you weren't using RAID and you had a backup (which you should have if you are using RAID as the data must be of significant importance to use RAID) then there would have been no issue?

Backup > RAID unless you run a business where data access is 24/7/365 and/or data needs extremely fast read/write speeds, i.e. a database.

I haven't had one Hard Drive delivered yet that has failed on fit - that will be out of about 200+.

 
Point well

I've had brand new drives fail many times, more times than I can count. I've also had replaced drives fail repeatedly. On one occasion the brand new and replacement Seagate Barracuda drives were faulty on delivery so many times that Seagate were themselves baffled and upgraded the whole array to Constellation drives for free! So it certainly happens.

RAID doesn't cause arrays to fail; drives fail, it's that simple. Having a hot/cold spare is simply the easiest way to deal with the issue. Even when you have backups, a failed array is a performance and availability hit, just like the failure of any other drive.
 
Having had a NAS for 10+ years, I would buy one when needed. I used to have a Drobo some time ago, but moved away to the more expensive but reliable Synology products.
 

My thoughts exactly @jimscreechy :).
 
I had an UNRAID setup where a disk failed. I thought "no problem", ordered a replacement disk and swapped it in, but couldn't get the data back despite the best efforts of UNRAID support. The lesson is that it's not always the disk that fails :)
 
That's why RAID is not a backup.
 
Yup, that’s why I back up to another device, and really important stuff goes to the cloud as well.
 
