Hardware and RAID configuration:
This was an 8x4TB QNAP RAID-5 NAS holding two thin-provisioned LVM volumes, each with a total capacity of about 13 TB.
One drive failed. During multiple rebuilding attempts, the customer encountered other complications.
Eventually the data became inaccessible.
The LVM headers were still intact. They contained the configuration of two thin-provisioned logical volumes with capacities of 12.7 and 12.6 TB.
The thin LVM metadata was stored near the end of the drives.
The metadata contained two B-trees, one for each logical volume.
There are three layers of virtualization:
- RAID logical storage space to physical drive space.
- LVM logical space to RAID space.
- Logical volume space to LVM space.
Each layer must be "devirtualized" by a table that maps virtual offsets to logical/physical offsets.
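The chain of translations can be sketched as follows. This is only an illustration, not the tool used in this case: the function names, the extent-table format (virtual start, length, target start) and the left-symmetric RAID-5 layout are assumptions, and the per-drive data offset of the RAID members is ignored.

```python
from bisect import bisect_right

def raid5_map(offset, n_disks, chunk_size):
    """Map a RAID logical byte offset to (disk_index, byte offset on that disk).

    Assumes a left-symmetric RAID-5 layout (an assumption, not a fact from this case).
    """
    chunk_no, within = divmod(offset, chunk_size)
    stripe, d = divmod(chunk_no, n_disks - 1)          # n_disks - 1 data chunks per stripe
    parity_disk = (n_disks - 1) - (stripe % n_disks)   # parity rotates backwards
    disk = (parity_disk + 1 + d) % n_disks             # data starts right after parity
    return disk, stripe * chunk_size + within

def table_map(table, offset):
    """Translate an offset through a sorted list of (virt_start, length, target_start)."""
    starts = [entry[0] for entry in table]
    i = bisect_right(starts, offset) - 1
    if i < 0:
        return None
    virt_start, length, target_start = table[i]
    if offset >= virt_start + length:
        return None                                    # offset falls in an unmapped hole
    return target_start + (offset - virt_start)

def devirtualize(offset, thin_table, lvm_table, n_disks, chunk_size):
    """Logical volume offset -> LVM offset -> RAID offset -> (disk, disk offset)."""
    lvm_off = table_map(thin_table, offset)
    if lvm_off is None:
        return None
    raid_off = table_map(lvm_table, lvm_off)
    if raid_off is None:
        return None
    return raid5_map(raid_off, n_disks, chunk_size)
```

A thin-provisioned volume may have holes, so a lookup that finds no mapping returns None rather than an address.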
Unfortunately, the B-tree structures were partially lost, so the trees could not be traversed top-down.
Instead, they were partially reconstructed bottom-up by gathering the remaining leaf nodes.
The most difficult task was to determine which of the two B-trees a particular leaf node belonged to. We spent two weeks
writing a program that used LVM space allocation patterns and metadata patterns to determine B-tree affiliation. The program
exceeded all expectations. It accurately assigned each leaf node to the correct B-tree.
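Once each surviving leaf has been assigned to a volume, rebuilding the mapping bottom-up amounts to merging the leaf entries. The sketch below is a simplified illustration only: the leaf representation (a dict with "entries" and a "generation" field used to break ties) is hypothetical and does not reflect the on-disk thin metadata format or the program described above.

```python
def rebuild_mapping(leaves):
    """Merge (virtual_block, data_block) entries from recovered leaf nodes.

    Each leaf is a hypothetical dict: {"generation": int, "entries": [(virt, data), ...]}.
    When two leaves map the same virtual block, the higher generation wins.
    """
    mapping = {}
    for leaf in sorted(leaves, key=lambda leaf: leaf["generation"]):
        for virt_block, data_block in leaf["entries"]:
            mapping[virt_block] = data_block   # later (newer) leaves overwrite older ones
    return dict(sorted(mapping.items()))

def coverage(mapping, total_blocks):
    """Fraction of the volume's virtual blocks that have a recovered mapping."""
    return len(mapping) / total_blocks

leaves = [
    {"generation": 2, "entries": [(0, 100), (1, 101)]},
    {"generation": 1, "entries": [(1, 55), (5, 200)]},
]
mapping = rebuild_mapping(leaves)
```

Virtual blocks with no surviving leaf entry stay unmapped, which is why the reconstruction is only partial.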
We quickly determined the RAID settings and devirtualized the RAID layer. Then we were able to reconstruct the first volume. The second volume
was more difficult: its initial recovery rate was just a little over 20%, so we had to apply file carving
techniques to improve the results.
The recovery rate for the first volume was about 80%. The metadata of the second volume was much more corrupted. After multiple iterations of
file carving operations, we achieved 50% recovery.
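File carving ignores the volume mapping and searches the raw data for known file signatures. The following is a minimal, generic sketch of signature-based carving for JPEG files, not the software used in this case. The function name and the size limit are assumptions.

```python
JPEG_SOI = b"\xff\xd8\xff"   # JPEG start-of-image marker plus first byte of the next marker
JPEG_EOI = b"\xff\xd9"       # JPEG end-of-image marker

def carve_jpegs(data, max_size=20 * 1024 * 1024):
    """Return (start, end) byte ranges in `data` that look like complete JPEG files."""
    found = []
    pos = 0
    while True:
        start = data.find(JPEG_SOI, pos)
        if start == -1:
            break
        # Search for the end marker only within max_size bytes of the start.
        end = data.find(JPEG_EOI, start + len(JPEG_SOI), start + max_size)
        if end == -1:
            pos = start + 1          # no plausible end; keep scanning after this start
            continue
        end += len(JPEG_EOI)
        found.append((start, end))
        pos = end
    return found

sample = b"junk" + b"\xff\xd8\xff\xe0abc\xff\xd9" + b"more"
ranges = carve_jpegs(sample)
```

A scan like this only recovers files stored contiguously. Fragmented files and files with embedded thumbnails need more careful handling, which is one reason carving typically takes several iterations.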
Read the in-depth technical paper Recovering thin-provisioned LVM volumes.