Everything is a fsck'ing RAID problem

Got myself a small, quick job yesterday. A customer had a server with a RAID-1 array: a Dell PERC box running Sarge with a broken mirror, running on the one remaining disk.

We had already recovered from file system corruption, data loss, and some other fine things last week. All we had to do was plug in a new disk and let the mirror rebuild.

Well, at least, that’s what RAID should be about. But as Kris already reported, not only DNS but also RAID is quite funky.

Dell seems to agree with Kris. In short, I afterwards found a forum comment from someone who confirmed the issue: initializing a new disk had the funny side effect of resetting existing “containers” (= RAID arrays) about 50% of the times you tried this exercise. Needless to say, this is big fun when all you wanted to do was put some redundancy back into a degraded mirror.

Enjoy hardware RAID. Be stuck with the controller BIOS interface. Quite a spartan one, I must say. Enjoy a simple interface without too much choice – or was it not enough choice?

When reinstalling was all we could do, I quickly decided to define each of the two disks as a simple volume. They were still exported by that funky SCSI controller, but once I had a separate sda and sdb, I was ready to let Debian manage them.
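
Just to sketch the starting point (the device names are assumptions, depending on how the controller exports its volumes):

    # Check that the kernel actually sees both single-disk volumes:
    cat /proc/scsi/scsi            # the PERC should list two disks
    fdisk -l /dev/sda /dev/sdb     # both should show up as plain SCSI disks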

At first I was quite optimistic. It was the first install I did with a freshly downloaded Etch: software RAID md devices and LVM volumes, all you can eat.
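
For the record, the layout I was after looks roughly like this. A sketch with hypothetical names (md0 for /boot, md1 as the LVM physical volume, vg0 as the volume group):

    # Mirror the matching partitions on both disks:
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
    # Put LVM on top of the big mirror:
    pvcreate /dev/md1
    vgcreate vg0 /dev/md1
    lvcreate -L 8G -n root vg0
    lvcreate -L 1G -n swap vg0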

Let me be short. The Debian installer, even in Etch, is funky. Luckily you can always ctrl-alt-F2 to a shell to get some more control and force some stuff manually. But in the meantime, that installer is full of bugs.

I remain deadly sure I configured the /boot partition (on an md device, not an LVM one) with the proper mount point. Turns out, and I noticed this again at a later reinstall, the Debian installer insists on resetting stuff like that. This is Not Nice.

Another funky problem is when the Debian installer kernel cannot handle some SCSI routines to let the running kernel know new md devices were created. That is not very handy when you need to put those PVs in a VG during the same setup, a setup that refuses to write to disk because you haven’t defined a root partition yet.
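
What got me through, from that ctrl-alt-F2 shell, was doing it by hand. A hedged sketch, assuming mdadm and the LVM tools are available in the installer environment, and reusing the hypothetical names from above:

    # Manually (re)assemble the array so the running kernel knows about it:
    mdadm --assemble /dev/md1 /dev/sda2 /dev/sdb2
    cat /proc/mdstat               # md1 should now be listed
    # Create the PV and VG by hand, then return to the partitioner:
    pvcreate /dev/md1
    vgcreate vg0 /dev/md1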

O well. Besides this funky music, Linux software RAID remains a lot more flexible.

- You can read those partitions from any (operating) system with a recent Linux kernel. No data lock-in.
- It’s no problem to create a mirror with just one disk. Especially when the other disk still holds data you need, it’s nice to know you can start the RAID array without having to reformat that disk first (see the sketch after this list).
- Did I mention flexibility?
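
To make that second point concrete, a hedged sketch with hypothetical device names: create the mirror with one half marked missing, copy the data over, and only then sacrifice the old disk.

    # Create the mirror degraded, with only the new disk in it:
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 missing
    # ... copy the data from the old disk onto the new array, then
    # repartition the old disk and add it in:
    mdadm --add /dev/md0 /dev/sdb1
    cat /proc/mdstat               # watch the mirror rebuild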

O, some nice article I found while unrelatedly browsing tfw today: an interesting read on md devices, which seems quite complete on the Debian RAID subject.