Panic update

From: Scott Walde <scott_at_no.spam.please>
Date: Tue Jun 05 2007 - 08:29:18 CST

Well, thank God it turned out that drive sdd had absolutely nothing
wrong with it. For the record, these are 4 SATA drives connected to the
built-in sil3114 controller on a Gigabyte motherboard. I used dd_rescue
(this program rocks!) to first make a copy of sdc.

1. Connected a fresh drive on sdd's cable.
2. dd if=/dev/sdc of=/dev/sdd bs=512 count=2 (I couldn't remember how many
sectors I really needed to copy to get the partition table. This was enough.)
3. fdisk /dev/sdd
4. In fdisk: write, exit. (This causes the kernel to load the new partition
table.)
5. dd_rescue -v /dev/sdc1 /dev/sdd1
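
For anyone wanting to try the partition-table copy from step 2 without touching real disks, here is a minimal sketch. It assumes a classic DOS/MBR layout, where the primary partition table sits in the first 512-byte sector, so count=1 is already enough for that case. The scratch image files and their names are hypothetical stand-ins for /dev/sdc and /dev/sdd.

```shell
#!/bin/sh
# Sketch only: scratch image files stand in for the real drives.
set -eu

workdir=$(mktemp -d)
src="$workdir/src.img"   # hypothetical stand-in for the failing drive
dst="$workdir/dst.img"   # hypothetical stand-in for the fresh drive

# Build a 1 MiB source with random contents and an all-zero destination.
dd if=/dev/urandom of="$src" bs=512 count=2048 2>/dev/null
dd if=/dev/zero of="$dst" bs=512 count=2048 2>/dev/null

# Copy only sector 0, which holds the DOS partition table.
# conv=notrunc stops dd from truncating the destination file; it is not
# needed when the destination is a block device.
dd if="$src" of="$dst" bs=512 count=1 conv=notrunc 2>/dev/null

# Compare the first sector of both images.
head -c 512 "$src" > "$workdir/src.sector0"
head -c 512 "$dst" > "$workdir/dst.sector0"
if cmp -s "$workdir/src.sector0" "$workdir/dst.sector0"; then
    mbr_copied=yes
else
    mbr_copied=no
fi

# The destination must keep its original size.
dst_size=$(wc -c < "$dst" | tr -d ' ')

rm -rf "$workdir"
echo "first sector copied: $mbr_copied, destination size: $dst_size bytes"
```

On real hardware the if= and of= arguments would be the device nodes, as in step 2, and it pays to double-check which one is which before pressing enter.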

This took over 4 hours and reported a number of errors. It was obvious
why sdc had been kicked out of the array. I then did the same with the original
sdd and a fresh drive on sdc's cable. This time I got no errors. So, I
tried assembling the raid again with sda, sdb and sdd. (I used the
original sdd at this time, as it didn't have any read errors when dd-ing
it.)

1. mdadm --assemble /dev/md0 --force /dev/sda1 /dev/sdb1 /dev/sdd1
It restarted the raid with no reported errors!
At this point I backed up a few critical files. (Yes, I did already
have backups, but they were probably a couple weeks old.) Then I added
in a new drive at sdc1:
2. mdadm --manage /dev/md0 --add /dev/sdc1
so it could start rebuilding.
I ran an fsck.ext3 -f /dev/md0 and it reported no errors!
Finally, after sdc was rebuilt, I pulled the drive in sdd, replaced it
with a fresh drive, and rebuilt onto that. (Yes, I'm still a bit paranoid.)
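
While a rebuild like this runs, the kernel reports the array state in /proc/mdstat. The sketch below pulls out the member-status field for one array and counts the missing members. The sample text is made up to illustrate the usual layout (including the assumption of a 4-drive raid5); it is not output from this array, and md_status is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: the sample below is illustrative, not real output from
# the array discussed in this post.
set -eu

sample='Personalities : [raid5]
md0 : active raid5 sdc1[4] sdd1[3] sdb1[1] sda1[0]
      732571648 blocks level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
'

# Print the member-status field (e.g. "[UU_U]") for one array.
# $1 = array name, stdin = mdstat text.
md_status() {
    awk -v md="$1" '
        $1 == md { found = 1; next }
        found && match($0, /\[[U_]+\]/) {
            print substr($0, RSTART, RLENGTH)
            exit
        }
    '
}

status=$(printf '%s' "$sample" | md_status md0)

# Each "_" marks a slot with no working member; "U" marks a member that is up.
missing=$(printf '%s' "$status" | tr -cd '_' | wc -c | tr -d ' ')

echo "md0 status: $status, missing members: $missing"
```

On a live system the same helper could be fed with `md_status md0 < /proc/mdstat`. Once the rebuild finishes, the field should show only U's.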

So, I'm happy to report that it appears I didn't lose a single bit of data!

ttyl
srw
Received on Tue Jun 5 08:29:23 2007
