Common mistakes while performing RAID recovery
Certain mistakes can render data unrecoverable. Based on the many cases we have worked on over the years, we have compiled the following list
for your reference. Note that even after a mistake has been made, data may still be recoverable through our fee-based RAID recovery
service. Recoverability depends on the type of mistake and on the user's subsequent actions.
The most common mistakes are:
- Errors made while repairing a degraded RAID 5.
- Chkdsk being run on one of the disks.
- RAID 5 initialization being performed.
- A RAID intended to be a RAID 5 but mistakenly configured as a RAID 0.
Error while repairing a RAID 5
This is by far the most common mistake. A RAID 5 continues to function in degraded mode after one disk
fails. The user is supposed to replace the bad disk with a new disk and perform a rebuild, which regenerates the lost data from the data and parity
on the remaining disks. When the rebuild is complete, the RAID is restored to its normal state.
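The rebuild described above can be illustrated with a small Python sketch. It assumes the usual RAID 5 scheme in which the parity block of each stripe is the byte-wise XOR of the data blocks in that stripe. The function name xor_blocks and the four-byte blocks are hypothetical and chosen only for illustration. Real controllers also rotate parity across the disks and work on much larger blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# One stripe of a hypothetical four-disk RAID 5: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the disk that held data[1]. XORing the surviving data
# blocks with the parity block regenerates the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

The same arithmetic shows why a second offline disk is fatal: with two blocks of a stripe missing, the XOR of the survivors no longer determines either one.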
Things that can go wrong include:
- A good disk is inadvertently replaced instead of the bad disk. When the good disk is disabled in preparation for replacement, the
RAID immediately fails because it cannot operate with two disks offline. At this point data can still be
recovered using File Scavenger, with or without our RAID recovery service. However, further mistakes
may lead to permanent data loss.
- The number of disks is incorrectly specified during the rebuild. Many RAIDs are configured with a "hot spare," which is not part of the RAID.
For example, a six-disk configuration with a hot spare disk is actually a five-disk RAID. If the RAID is rebuilt as a six-disk RAID, the data will
be corrupted. When this occurs, data can sometimes be recovered using our RAID recovery service. At other
times data may be lost permanently.
The number of disks in a RAID 5 can be computed using the following formula:
Number of disks in a RAID 5 = (size of logical RAID disk / size of each physical disk) + 1
For example, if a RAID 5 is made up of 20-GB disks and the logical RAID disk is 60 GB, it is a four-disk RAID: (60 / 20) + 1 = 4.
- The rebuild is interrupted by an external event such as a power loss or human error.
- The RAID is rebuilt with the member disks out of their original order because of a change in the disk configuration.
- The RAID is rebuilt with a different stripe (block) size.
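The disk-count formula given in the list above can be sketched in Python as a quick sanity check before a rebuild. This is only an illustration: the function name raid5_disk_count is hypothetical, and rounding the ratio is an assumption made here because reported sizes may not be exact multiples of each other.

```python
def raid5_disk_count(logical_size_gb, disk_size_gb):
    """Return the number of member disks in a RAID 5.

    Implements: (size of logical RAID disk / size of each physical disk) + 1.
    The ratio is rounded to the nearest whole number (an assumption of this
    sketch, to tolerate small differences in reported sizes).
    """
    if logical_size_gb <= 0 or disk_size_gb <= 0:
        raise ValueError("sizes must be positive")
    return round(logical_size_gb / disk_size_gb) + 1


# The example from the text: 20-GB disks and a 60-GB logical disk.
print(raid5_disk_count(60, 20))

# Six installed 20-GB disks, one of which is a hot spare: the logical disk
# is 80 GB, so the RAID itself has five members, not six.
print(raid5_disk_count(80, 20))
```

If the result is one less than the number of installed disks, the extra disk is probably a hot spare and should not be counted as a RAID member.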
Chkdsk being run on one of the disks
Chkdsk is a Windows utility that uses simple algorithms to repair file system errors on a corrupt disk. Chkdsk can be started by the user, or automatically
by Windows if it detects disk corruption at boot time. (Windows will ask for confirmation before starting chkdsk but
will start it anyway if no response is received within a number of seconds.) Chkdsk is not RAID-aware, and its algorithms work
only in the simplest cases. In general, it must be avoided.
If chkdsk attempts to repair one or more member disks, the RAID data patterns will be destroyed, usually beyond recovery.
Performing RAID 5 initialization
If a RAID 5 is inadvertently reinitialized, all data will be lost. Reinitialization usually takes hours and writes zeros over
every member disk in its entirety. There will be no data left to recover.
A RAID intended to be RAID 5 but mistakenly configured as RAID 0
It is not uncommon for a RAID intended to be a RAID 5 to be erroneously built as a RAID 0.
When a disk fails and cannot be physically repaired, data is lost because RAID 0 provides no redundancy.
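One rough way to notice this misconfiguration before a disk fails is to compare the logical capacity with the size of the member disks. A RAID 0 exposes the combined capacity of all member disks, while a RAID 5 gives up one disk's worth of space to parity. The Python sketch below illustrates the comparison. The function name capacity_hint and its return labels are hypothetical, and the result is only a hint, since a hot spare or another RAID level can produce the same numbers.

```python
def capacity_hint(logical_size_gb, disk_size_gb, member_count):
    """Guess from capacity alone whether an array looks like RAID 0 or RAID 5.

    Returns "raid0-like" when the logical size matches all member disks,
    "raid5-like" when it matches all but one, and "unknown" otherwise.
    """
    if logical_size_gb <= 0 or disk_size_gb <= 0 or member_count < 2:
        raise ValueError("sizes must be positive and member_count at least 2")
    # Rounding is an assumption of this sketch, to tolerate small
    # differences in reported sizes.
    ratio = round(logical_size_gb / disk_size_gb)
    if ratio == member_count:
        return "raid0-like"   # full capacity exposed: no redundancy
    if ratio == member_count - 1:
        return "raid5-like"   # one disk's worth of capacity used for parity
    return "unknown"


# Four 20-GB disks exposing 80 GB: no space is set aside for parity.
print(capacity_hint(80, 20, 4))

# Four 20-GB disks exposing 60 GB: consistent with a RAID 5.
print(capacity_hint(60, 20, 4))
```

The RAID level reported by the controller or its management software remains the authoritative source. The capacity comparison is merely a quick cross-check.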