Remember that this only makes sense after you've done it once... :)
Test PC:
PII-350, 384MB RAM, an Adaptec 2940U2 SCSI controller, and 3 18GB Seagate drives.
Test Mac:
Blue & White G3, 256MB RAM, Adaptec 2940 SCSI controller, 3 8GB SCSI drives.
These instructions have been tested on various flavors of OpenLinux, Red Hat, Mandrake, and now Yellow Dog Linux for PowerPC.
IMPORTANT NOTE: According to the HOWTOs, if you're going to use IDE drives for this, you should only have one drive per channel. Slave drives will kill the performance of the RAID, so factor the purchase of a couple of IDE controllers into your budget. I have personally tested the Promise UDMA-100 cards in a RAID configuration, and they work very well.
RAID-5 requires at least 3 hard drives of the same size, so you should install those and make sure they work before starting this process.
General Partitioning notes:
Since RAID-5 isn't supported by most installers, you must first install Linux to one of the drives; later on we'll convert that drive to become part of the RAID. If you have at least 128MB RAM, skip the swap partitions. We'll create a swapfile on the RAID later so that the box won't crash if a drive dies. Don't split the mount points up among partitions as you normally would. To make our job easier later, create a 50MB partition at the front of each of the first 2 drives and leave those partitions empty for now. Put '/' on the first drive's large Linux partition, and leave the large Linux partitions on the other 2 drives empty.
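If it's been a while since you've driven fdisk by hand, here's a minimal sketch of carving up the first drive (the exact prompts vary between fdisk versions, so treat this as a guide rather than gospel):
fdisk /dev/sda
n (new partition)
p (primary)
1 (partition number)
<enter> (accept the default starting cylinder)
+50M (size of the small partition)
n
p
2
<enter>
<enter> (use the rest of the disk)
w (write the table and exit)
Repeat on /dev/sdb; /dev/sdc just gets one big Linux partition.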
Mac partitioning notes:
You may see lots of strange Apple partitions on your disk. As long as you're not dual-booting with MacOS go ahead and delete them. It won't hurt anything, and you can always put them back later with Apple's disk utilities.
IMPORTANT: Don't delete partition 1! The first partition of a Mac disk is the partition table, so that would cause all kinds of havoc.
In addition to the Linux partitions, allocate a 10MB Apple Bootstrap partition at the beginning of the first two disks. This is where your bootloader (Yaboot) will go.
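I did the Mac partitioning with pdisk; here's a rough sketch, assuming your pdisk behaves like mine did (hit '?' at its prompt for help if it doesn't):
pdisk /dev/sda
p (print the current partition map)
C (create a partition; pdisk prompts for first block, length, name, and type)
(for the bootstrap partition, give it a length of 10M and the type Apple_Bootstrap)
w (write the map)
q (quit)
Create the Linux partitions the same way, using the type Apple_UNIX_SVR2.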
My PC partition structure looks like:
/dev/sda1 - 50MB, Linux, empty
/dev/sda2 - 17GB, Linux, /
/dev/sdb1 - 50MB, Linux, empty
/dev/sdb2 - 17GB, Linux, empty
/dev/sdc1 - 17GB, Linux, empty
My Mac partition structure looks like:
/dev/sda1 - Apple partition map
/dev/sda2 - 10MB, Apple Bootstrap
/dev/sda3 - 50MB, Linux, empty
/dev/sda4 - 8GB, Linux, /
/dev/sdb1 - Apple partition map
/dev/sdb2 - 10MB, Apple Bootstrap
/dev/sdb3 - 50MB, Linux, empty
/dev/sdb4 - 8GB, Linux, empty
/dev/sdc1 - Apple partition map
/dev/sdc2 - 8GB, Linux, empty
Mac Kernel Notes:
You'll need a recent PPC kernel for this to work on a Mac. These are available at www.ppckernel.org. I used 2.4.20-ben10. You'll also need a new version of Yaboot, available at penguinppc.org. I used 1.3.10. If you're accustomed to building kernels on Intel you generally use 'make bzImage' as your final step. Unfortunately compressed kernels aren't supported on PPC, so you'll have to use 'make vmlinux' instead.
Once the recompile is complete, move the kernel into place and edit grub/lilo/yaboot accordingly. Then reboot and check that all your hardware is seen.
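For the record, the whole 2.4-era build dance looks roughly like this (a sketch from memory; the version string matches my setup, so adjust to your tree). Make sure the RAID-1 and RAID-5 drivers are compiled into the kernel rather than as modules, since the kernel has to assemble the array before it can load anything from it:
cd /usr/src/linux
make menuconfig (enable Multiple devices driver support, plus RAID-1 and RAID-5, as built-ins)
make dep (2.4 kernels still need this step)
make vmlinux (this is where Intel folks would run 'make bzImage')
make modules
make modules_install
cp vmlinux /boot/vmlinux-2.4.20-ben10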
Now we'll create the /etc/raidtab file that will configure your RAID devices. On the PC this should contain the following:
raiddev /dev/md0
raid-level 5
nr-raid-disks 3
nr-spare-disks 0
persistent-superblock 1
parity-algorithm left-symmetric
chunk-size 32
device /dev/sdb2
raid-disk 1
device /dev/sdc1
raid-disk 2
device /dev/sda2
failed-disk 0
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
persistent-superblock 1
chunk-size 32
device /dev/sda1
raid-disk 0
device /dev/sdb1
raid-disk 1
On the Mac, /etc/raidtab should contain:
raiddev /dev/md0
raid-level 5
nr-raid-disks 3
nr-spare-disks 0
persistent-superblock 1
parity-algorithm left-symmetric
chunk-size 32
device /dev/sdb4
raid-disk 1
device /dev/sdc2
raid-disk 2
device /dev/sda4
failed-disk 0
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
persistent-superblock 1
chunk-size 32
device /dev/sda3
raid-disk 0
device /dev/sdb3
raid-disk 1
- raiddev /dev/md0 - specifies that we're creating RAID device /dev/md0
- raid-level 5 - specifies that this is a RAID-5 array
- nr-raid-disks - specifies the number of *active* disks in the array
- nr-spare-disks - specifies the number of spare disks in the array (spare disks are used automagically if an active disk fails)
- persistent-superblock 1 - puts a block on each RAID device that contains info about its position in the array (among other things). Ever wonder what happens if you physically re-arrange drives in the array by accident, or switch a cable to the wrong drive? Without this, the array wouldn't know and would go to pieces. On the PC this also allows booting from the array, which we'll get to later.
- parity-algorithm left-symmetric - specifies the algorithm used to spread the parity info among the disks. Don't ask me exactly how it works; it's simply what the docs recommend for best performance.
- chunk-size 32 - specifies the chunk size, in KB, that the array writes in. This has an effect on performance, but since I don't understand all that too well I just use what the docs recommended.
- device /dev/sdxx - specifies the device name of each partition to be included in the array.
- raid-disk x - specifies a unique number assigned to each device in the array.
- failed-disk x - specifies a device that is in a failed state. In this case, we specify our current non-RAID boot device so that the RAID doesn't try to incorporate it into the array yet. That Would Be Bad(tm).
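A quick sanity check before running mkraid: RAID-5 gives you (N-1) drives' worth of usable space, because one drive's worth is consumed by parity. Here that works out to 2 x 17GB = 34GB usable on the PC and 2 x 8GB = 16GB on the Mac, so don't be surprised when df reports less than the raw total.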
Now let's create our arrays. This part is easy:
mkraid /dev/md0
mkraid /dev/md1
Run 'cat /proc/mdstat' to check the status of your RAID devices. (md, by the way, stands for 'multiple devices'. It's the kernel's shorthand for RAID devices.)
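Healthy output looks something like this (device names match my PC layout, block counts are illustrative). Note that md0 reports itself degraded, [3/2] and [_UU], because of our failed-disk entry; that's expected for now:
Personalities : [raid1] [raid5]
md0 : active raid5 sdc1[2] sdb2[1] 35566336 blocks level 5, 32k chunk, algorithm 2 [3/2] [_UU]
md1 : active raid1 sdb1[1] sda1[0] 51136 blocks [2/2] [UU]
unused devices: <none>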
NOTE: The RAID autodetection steps are PC-only. Mac users should skip this section and resume reading at the 'make filesystems' step.
Now that we know our arrays are working, let's stop them and set up auto-detection. Auto-detection makes use of the 'persistent superblock' that we enabled in /etc/raidtab. mkraid installed that superblock on each RAID device, and once we've set the partition type correctly, the kernel will see all our RAID devices at boot.
raidstop /dev/md0
raidstop /dev/md1
fdisk /dev/sda
p
t
1
fd
w
This lists the partition table, selects a partition to work on, and then sets the partition type to 'fd' (Linux raid autodetect). It then writes the new partition table to disk. Do this to *each partition* to be used in the array. Then reboot and watch the kernel auto-detect your arrays.
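If you'd rather script it than repeat the fdisk dance for every partition, older sfdisk versions can change the type directly. This assumes your sfdisk supports --change-id, so check its man page first:
sfdisk --change-id /dev/sda 1 fd
sfdisk --change-id /dev/sda 2 fd
sfdisk --change-id /dev/sdb 1 fd
sfdisk --change-id /dev/sdb 2 fd
sfdisk --change-id /dev/sdc 1 fd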
Now we'll make filesystems on our arrays. We'll make '/boot' ext2 and '/' ReiserFS. You can also use other filesystems; for the Mac I tested with ext3.
mke2fs /dev/md1
mkreiserfs /dev/md0
mkdir /raid
Now we'll copy our stuff over to the new '/boot' partition:
mount -t ext2 /dev/md1 /raid
cp -a /boot/* /raid
umount /dev/md1
Then copy everything else over to the new '/':
mount -t reiserfs /dev/md0 /raid
for i in `find / -maxdepth 1 -type d | egrep -v 'boot|proc|raid|^/$'`
do
cp -a $i /raid
done
mkdir /raid/proc /raid/boot
The egrep keeps /boot, /proc, and /raid themselves from being copied; the mkdir recreates those mount points on the new root.
Now edit /raid/etc/fstab, modifying the mount point for '/' and adding one for '/boot'. Something like:
/dev/md0 / reiserfs defaults 1 1
/dev/md1 /boot ext2 defaults 1 1
For the PC, create a LILO configuration with a fallback entry so you can test things safely:
umount /raid
vi /etc/lilo.conf
boot=/dev/sda1
install=/boot/boot.b
lba32
prompt
delay=50
timeout=50
default=linux-raid
image=/boot/vmlinuz-2.4.2-raid
label=linux-raid
root=/dev/md0
read-only
image=/boot/vmlinuz-2.4.2-raid
label=fallback
root=/dev/sda2
read-only
Run /sbin/lilo to set up LILO on your first partition. Note the 'fallback' entry. If something goes wrong you can still boot back to your non-RAID configuration by typing 'fallback' at the LILO prompt.
Now copy your lilo.conf to /etc/lilo.sda and /etc/lilo.sdb. We need one for each mirror of the RAID-1 partition, because we're going to install LILO on each drive so that if the primary disk fails, we can still boot. Essentially, we're making LILO redundant. Change /etc/lilo.sda so that the line reads 'boot=/dev/sda', change /etc/lilo.sdb so that the line reads 'boot=/dev/sdb', and then install LILO onto the MBR of each drive:
/sbin/lilo -C /etc/lilo.sda
/sbin/lilo -C /etc/lilo.sdb
On the Mac, Yaboot takes the place of LILO; an example yaboot.conf appears below. Note the 'device=' line. That will be different depending on your machine. Run:
ofpath /dev/sda
to get the Open Firmware path for your first SCSI drive, and put that in your 'device=' line. Also important is the 'partition=' line. This should be the number of the partition that contains your kernel. In this case, the array /dev/md1 contains our kernel and it's on partition 3.
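On my B&W G3 that looked like the following (illustrative output; yours will differ, but it should match the device= line in the example config below):
ofpath /dev/sda
/pci@80000000/pci-bridge@d/ADPT,2940U2B@4/@0: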
Now make a copy of the config for each drive:
cp /etc/yaboot.conf /etc/yaboot.sda.conf
cp /etc/yaboot.conf /etc/yaboot.sdb.conf
Change the 'boot=' line in the second file to /dev/sdb2 and the 'device=' line to the result of 'ofpath /dev/sdb'. Then run:
ybin -C /etc/yaboot.sdb.conf
ybin -C /etc/yaboot.sda.conf
to install Yaboot on both Bootstrap partitions. Example yaboot.conf:
# ybin options
boot=/dev/sda2
magicboot=/usr/local/lib/yaboot/ofboot
delay=10
defaultos=linux
enablecdboot
# yaboot options
init-message="\nWelcome to Yellow Dog Linux\n\n"
timeout=50
default=linux
# yaboot images
image=/vmlinux-2.4.20-ben10
label=linux
root=/dev/md0
partition=3
# pass the array members on the kernel command line, since partition-type autodetection isn't available here
append="md=0,/dev/sda4,/dev/sdb4,/dev/sdc2 md=1,/dev/sda3,/dev/sdb3"
device=/pci@80000000/pci-bridge@d/ADPT,2940U2B@4/@0:
image=/boot/vmlinux-2.4.20-ben10
label=fallback
root=/dev/sda4
partition=4
Reboot and try it out.
Mac Note: The Blue & White G3 I used seems to have a pretty dumb Open Firmware. If you unplug the primary drive to test the array, be aware that the firmware takes a very long time to figure it out. In my case, it made me type 'mac-boot' before it would even fail over. Not very smart. I've been told that the G4s are better, but I haven't verified that.
If all goes well, you've just booted from the array. Now it's time to add that old partition into your RAID-5 array and enable redundancy. First, edit /etc/raidtab and change the label 'failed-disk' to 'raid-disk'. This tells the RAID the partition is OK for use now. Then add it to the array by running:
raidhotadd /dev/md0 /dev/sda2
(that's /dev/sda4 in our Mac configuration)
Run 'watch cat /proc/mdstat' to see it build the redundancy. You should see a line that says something about 'recovery' and an estimated time for completion. Once it finishes you are running a fully redundant system. You should be able to survive a hard drive failure without data loss.
Now it's time to set up our swapfile. It will exist inside the array so that a dead drive won't crash the machine. Generally you should set up a swapfile that is 2 times the size of your RAM, though for machines with lots of memory this may not be practical. First, figure out how many blocks you'll be using: take the RAM count in MB, multiply by 1024 (to convert to KB), and then double it. In my case I have 256MB, so 256*1024*2 = 524288.
Then:
cd /
dd if=/dev/zero of=swapfile bs=1024 count=524288
This will give 512MB of swapspace in /. Now run:
mkswap /swapfile
swapon /swapfile
to create and activate the swapspace. Next we'll add our new swap space to /etc/fstab so that it will be used automatically. Add a line to /etc/fstab that looks like this:
/swapfile swap swap defaults 0 0
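To confirm the kernel is actually using it (both now and after a reboot), check the swap table; you should see /swapfile listed along with its size:
swapon -s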
And we're done.
Written by Aaron Grewell on 11-April-2003.