Reply to post: Re: Hold on a second...

Flash banishes the spectre of the unrecoverable data error

Gordan

Re: Hold on a second...

"If (i was using RAID and) recovery did go wrong, I'd expect it to recover everything it could, and apologise profusely for the odd file which was lost. If instead it wigs out and fails then you're better off not having it in the first place."

Except that we are talking about block-level corruption, which sits underneath the FS level. Unless you are running a full-stack solution like ZFS, it won't be trivial even to find out which file occupied the block that failed to scrub. With any block-level corruption there is a possibility that the entire FS ends up hosed, even if your RAID implementation is clever enough (and many aren't) to give up on the errant block and carry on rebuilding the rest of the data.

The problem is compounded by the fact that most disks today come with no Time-Limited Error Recovery (TLER), and those that do don't have it enabled by default. So when the disk hits an unreadable block, it will keep retrying the read while ignoring all other commands. Eventually the higher layers will time out the commands and kick the disk out, at which point you will have lost a whole second disk from the array, and will quite likely need to restore the whole lot from a backup. With TLER, the disk gives up on the read and reports the error much sooner, before it gets kicked out of the array or off the controller.
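
To put rough numbers on that race, here is a toy sketch in Python. It is only a model, not how any real controller or md implementation decides things: the function name, the result strings and all three timings are made up for illustration, and it pretends the host drops the disk on the first timed-out command, whereas real stacks usually try resets and retries first.

```python
from typing import Optional


def read_outcome(internal_retry_s: float,
                 host_timeout_s: float,
                 erc_limit_s: Optional[float] = None) -> str:
    """Toy model of a disk hitting an unreadable block.

    internal_retry_s: how long the disk would keep retrying on its own
                      (hypothetical figure, varies by drive).
    host_timeout_s:   how long the RAID layer/controller waits for a command
                      (hypothetical figure).
    erc_limit_s:      TLER-style cap on error recovery, or None if the
                      feature is absent or disabled.
    """
    # With a recovery limit, the disk stops retrying once the limit is hit.
    if erc_limit_s is None:
        busy_s = internal_retry_s
    else:
        busy_s = min(internal_retry_s, erc_limit_s)

    # Simplification: the host drops the disk as soon as one command times out.
    if busy_s >= host_timeout_s:
        return "disk dropped"
    return "read error reported"


# Illustrative timings only: 120 s of retries against a 30 s host timeout.
print(read_outcome(120, 30))        # no TLER
print(read_outcome(120, 30, 7))     # recovery capped at 7 s
```

In the first call the disk is still busy retrying when the host gives up on it; in the second the disk reports the bad block first, so the array at least has the chance to carry on with the rest of the rebuild.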
