Date: Fri, 17 Apr 1998 12:12:57 +0200
To: Eric van Dijken , "linux-raid@vger.rutgers.edu"
Subject: Re: Looking for root RAID0 sollution
Reply-To: Martin Schulze
X-Orcpt: rfc822;linux-raid@vger.rutgers.edu
Sender: owner-linux-raid@vger.rutgers.edu
Precedence: bulk
X-Loop: majordomo@vger.rutgers.edu

Hi,

funny, I finished my little howto about exactly that yesterday.
Please send me comments.

On Fri, Apr 17, 1998 at 10:25:39PM +0400, Eric van Dijken wrote:
> I'am looking for a root filesystem RAID0 sollution.

How to set up RAID
------------------

Using filesystems with RAID (Redundant Array of Inexpensive Disks) has
many advantages. First there is speed. RAID combines several disks and
reads and writes chunks from them in turn. That way it can reach
transfer speeds up to three times that of the slowest disk, maybe even
more. For technical information on RAID please refer to .

To do RAID with Linux you first need a kernel with appropriate support.
Linux 2.0.x supports linear and striping modes (the latter is also
known as RAID-0). Since Linux 2.1.63 the kernel also supports RAID-4
and RAID-5.

To use any of them you need to have special tools installed. For linear
and RAID-0 you need the mdutils package. To use RAID 4/5 you need to
have the raidtools package installed and a kernel version higher than
2.1.62.

With RAID (not linear) you'll get the best results if you use
partitions of exactly the same size. The RAID driver will work with
different sizes, too, but is less efficient, as you may imagine after
reading some RAID documents.

1 Setting up RAID

Setting up RAID for normal filesystems such as /var, /home or /usr is
quite simple. First you need to partition your disks. After you've done
that you need to tell the RAID subsystem how you want to organize the
partitions, e.g.
with

    mdcreate -c4k raid0 /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1
    mdcreate -c8k raid0 /dev/md1 /dev/sdd1 /dev/sde1 /dev/sdf1

This creates two RAID devices, each consisting of three partitions. The
first one has a chunk size of 4k while the second one uses 8k chunks.
These commands will create appropriate entries in /etc/mdtab. The next
step is to activate these devices with:

    mdadd -ar

From now on you may refer to /dev/md0 and /dev/md1 as block devices
carrying your filesystems. Now you may create your filesystems on the
new devices and add them to /etc/fstab just as usual.

Debian GNU/Linux is configured to initialize and activate any RAID at
boot time so you should not run into problems. Please note that you
need to have the RAID drivers compiled into the kernel; modules might
not work.

2 Swapping over RAID

The kernel has native support for distributing swap space over several
disks. Just add all swap partitions to /etc/fstab and use 'swapon -a'
to activate all of them. The kernel interleaves swap areas that have
the same priority, which works like striping (RAID-0); see the pri=
option in swapon(8) if the partitions are filled one after another
instead. Here's a sample setup:

    /dev/sda3    none    swap    sw
    /dev/sdb3    none    swap    sw
    /dev/sdc3    none    swap    sw

3 Root filesystem on RAID

The use of RAID on the root filesystem is a little bit tricky. The
problem is that LILO can't read and boot the kernel if it is not stored
linearly on the disk (as it is on ext2 or dos). The solution is to put
the kernel on a different filesystem that doesn't use RAID.

This way LILO can boot the kernel, but the kernel itself is unable to
mount the root filesystem because its RAID subsystem isn't initialized
yet. So you need a mechanism to call some programs, at least mdadd,
after the kernel is loaded and before it tries to mount the root
filesystem.

The only way to get this is to use the initial ramdisk, also known as
initrd. General information about initrd may be found in the
Documentation directory in the kernel source tree.
If the Linux kernel uses initrd it mounts the ramdisk as root
filesystem and executes /linuxrc if it exists. After this is finished
the kernel continues its boot process and mounts the real root
filesystem. The old / (from the initrd) will be moved to /initrd if
that directory is available, or unmounted otherwise.

The initrd file is a simple rootdisk. It should contain all the files
that are needed for processing the /linuxrc file. This includes a
working shell if it's a shell script, and all tools that are used in
the script. This might include a working libc with ld.so and tools,
too.

After you have initialized RAID from /linuxrc you need to tell the
kernel where its root filesystem resides. As it uses the initrd it
might not know. There is an easy interface for this using the proc
filesystem. You only need to echo the appropriate device number to
/proc/sys/kernel/real-root-dev and the kernel continues with that
setting.

I use a root filesystem on a RAID and this is what I have done. There
might be a better solution, I don't know. Mine works and looks logical
- at least to me. :-)

As lilo isn't able to boot from a non-linear block device (such as
RAID) you need to reserve a small partition with the kernel on it. I've
decided to use a 10MB partition which I use as /boot and put stuff on
it. 10MB is plenty of space for only one kernel and initrd; currently
my system only uses 2.5 MB of it. So /etc/lilo.conf still points to
/boot/vmlinuz-2.0.34 in this setup.

Now, decide what needs to be done in the /linuxrc script. You only need
to activate RAID and tell the kernel where your root filesystem
resides. The following script should do it:

    #! /bin/ash

    # Only proceed if there is a non-empty mdtab and mdadd is present.
    if [ -s /etc/mdtab -a -f /sbin/mdadd ]
    then
        echo "Preparing system for rootfs raid."
        # Activate all RAID devices listed in /etc/mdtab.
        /sbin/mdadd -ar
        # /proc is needed only to set the real root device.
        /bin/mount -t proc /proc /proc
        echo 0x900 > /proc/sys/kernel/real-root-dev
        /bin/umount /proc
    else
        echo "No mdtab or mdadd found."
    fi

You may use any block device as root filesystem.
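The value echoed to real-root-dev in the script is a device number
built from the major and minor numbers of the root device. As a minimal
sketch, assuming the traditional 16-bit encoding (8-bit major, 8-bit
minor) used by these kernels, a helper like the following could compute
it. The name devnum is hypothetical and not part of mdutils.

```shell
#!/bin/sh
# devnum MAJOR MINOR
# Hypothetical helper: print the hex device number for real-root-dev.
# Assumes the classic encoding (major << 8) | minor.
devnum() {
    printf '0x%x\n' $(( ($1 << 8) | $2 ))
}
```

For example, "devnum 9 1" prints 0x901, the value that would select
/dev/md1 as root device.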
0x900 stands for major number 9 and minor number 0, which is /dev/md0.

Now make a list of the binaries and additional files needed. Of course
you need some device files in /dev/ as well. To get the /linuxrc script
working at all, you need to have /dev/tty1. The other devices depend on
your /etc/mdtab file. You will at least need /dev/md0.

    Binaries: ash, mount, umount, mdadd
    Files:    mdtab, fstab and mtab, and for safety passwd
    Devices:  tty1, plus those named in /etc/mdtab

I use this mdtab:

    # mdtab entry for /dev/md0
    /dev/md0    raid0,4k,0,93f5553f    /dev/hda2 /dev/hdb2
    # mdtab entry for /dev/md1
    /dev/md1    raid0,8k,0,3ffaa1d8    /dev/hda4 /dev/hdb4

Therefore I have created these block devices:

    /dev/hda2  /dev/hda4  /dev/hdb2  /dev/hdb4
    /dev/md0   /dev/md1   /dev/md2   /dev/md3

You can use the mknod program to create the device files, e.g. with the
following command for tty1 (run from the top of the initrd directory):

    mknod dev/tty1 c 4 1

Ok, but how does one create the initrd file? The best thing you can do
is to create the directory /tmp/initrd and install everything in it.
When you're finished you determine the disk space it uses (du -s) and
create the initrd itself. The following commands create a 1M initial
ramdisk and mount it on /mnt. This is what I use.

    dd if=/dev/zero of=/tmp/initrd.bin bs=1k count=1024
    mke2fs /tmp/initrd.bin
    mount -o loop /tmp/initrd.bin /mnt

You now have to decide whether you're going to use dynamically linked
binaries together with the libc and linker, or compile your own static
binaries. I've decided to use the latter and compiled ash, mount/umount
and mdadd myself.

This can be done very easily. You only have to fetch the source, unpack
it with "dpkg-source -x foobar.dsc", cd to foobar-version, edit
debian/rules and add LDFLAGS="-static" to the $(MAKE) call inside the
build target.
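The sizing step mentioned above (du -s, then dd) can also be scripted.
This is only a sketch: the function name initrd_blocks is hypothetical,
and the headroom of 25% plus 64k for ext2 metadata is an assumption,
not a measured figure.

```shell
#!/bin/sh
# initrd_blocks DIR
# Hypothetical helper: print a count of 1k blocks for dd, based on the
# space DIR currently uses.
initrd_blocks() {
    # du -sk reports the usage of DIR in kilobytes in the first field.
    used=$(du -sk "$1" | cut -f1)
    # Assumed headroom: 25% of the contents plus 64k for metadata.
    echo $(( used + used / 4 + 64 ))
}
```

It could be used in place of the fixed count, e.g.
dd if=/dev/zero of=/tmp/initrd.bin bs=1k count="$(initrd_blocks /tmp/initrd)".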
Once debian/rules is edited, you only need to issue
"make -f debian/rules build" for each needed package and install the
resulting binaries into the appropriate directories within /tmp/initrd.

Don't forget to create the /proc directory or mount will fail. The
fstab and mtab files can be empty. They will only be read, not written
to, but they need to exist. For the /etc/passwd file it's sufficient to
include the root user.

After you have copied everything from /tmp/initrd to the ramdisk,
unmount it and move it to /boot/initrd.bin.

Now you need to tell lilo to load the kernel and the ramdisk. That's no
problem, just use a record in /etc/lilo.conf like the following:

    image=/boot/vmlinuz-2.0.34
        initrd=/boot/initrd.bin
        label=linux
        read-only

Issue the command "lilo" and you're nearly done. As the RAID subsystem
is now configured at boot time, before any /etc/init.d scripts are run,
you should disable the mdadd call in the /etc/init.d scripts.

Regards,

    Joey

-- 
  / Martin Schulze  http://home.pages.de/~joey/
 / *** Fatal Error: Found [MS-Windows],  joey@linux.de /
/  repartitioning Disk for Linux ...  /