This doc is obsolete -- see http://doc.nethence.com/ instead
Setting up NetBSD RAIDframe
Introduction
This guide has two parts. The first one describes setting up RAID-1 so the system can boot from it. The second part describes creating RAID-1 and RAID-5 arrays that are not bootable -- just for storage (but still auto-configurable, with no flat configuration file needed afterwards). Since the arrays are auto-configurable, the configuration files are not put into /etc/raidX.conf but into /var/tmp/raidname.conf, so the drives can be moved to another system without losing the array configuration. The drawback is that when a component fails and the system reboots, you won't be able to see exactly which component failed, as the failed drive is no longer auto-configured at boot time and hence is not identified by the RAIDframe system.
Requirements
Make sure you've got RAIDframe enabled in the kernel (by default),
dmesg | grep -i raid
Make sure the disks you want to use for RAID are identical (it may also work with different disks if you adjust the disklabels, but identical disks are assumed here),
dmesg | grep ^wd
dmesg | grep ^sd
Make sure the SMART statuses are alright (also check with the BIOS messages at power on),
atactl wd0 smart status
atactl wd1 smart status
Optionally, disable write caching if you don't have an Uninterruptible Power Supply (UPS) (by the way, what happens if there is a kernel panic?),
dkctl wd0 getcache
dkctl wd1 getcache
dkctl wd0 setcache r
dkctl wd1 setcache r
dkctl wd0 setcache r save
dkctl wd1 setcache r save
dkctl wd0 getcache
dkctl wd1 getcache
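With several drives to check, a small shell loop saves typing; a minimal sketch, assuming the drives are wd0 and wd1,
for disk in wd0 wd1; do
        echo "== $disk =="
        atactl $disk smart status
        dkctl $disk getcache
done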
*** Part A -- Setting up RAID-1 for system boot ***
Note. we will make sure the RAID partition starts at block 63, not 2048 (???), and has the maximum size.
Note. for these boot disks, disklabel may fail when called with wd0 or wd1 as the argument,
disklabel: could not read existing label
you may have to do this instead,
disklabel wd0a
disklabel wd1a
Preparing the disks
Assuming you are currently running the system on wd0, we are going to set up RAID-1 on an additional (identical) disk, wd1, boot from it, and only then integrate wd0 into the array.
Erase the MBR and DOS partition table on wd1,
#dd if=/dev/zero of=/dev/rwd1d bs=8k count=1
dd if=/dev/zero of=/dev/rwd1d bs=1024k count=1
Configure a NetBSD MBR partition spanning the whole disk (look at wd0 first for reference),
fdisk wd0
fdisk -0ua /dev/rwd1d
answer the questions,
Do you want to change our idea of what BIOS thinks? [n]
sysid: [0..255 default: 169]
start: [0..14593cyl default: 63, 0cyl, 0MB]
#or start: [0..24321cyl default: 2048, 0cyl, 1MB] 63
size: [0..14593cyl default: 234441585, 14593cyl, 114473MB]
#or size: [0..24321cyl default: 1985, 0cyl, 1MB] 390721905 (based on what fdisk wd0 showed, equals total - 63, actually)
bootmenu: []
Do you want to change the active partition? [n] y
active partition: [0..4 default: 0]
Are you happy with this choice? [n] y
Update the bootcode from /usr/mdec/mbr? [n] y
Should we write new partition table? [n] y
Check that both disks have identical DOS partition tables,
cd ~/
fdisk wd0 > wd0
fdisk wd1 > wd1
diff -u wd0 wd1
Note. if you get the "PBR is not bootable: All bytes are identical (0x00)" message, it will be fixed at installboot time.
Edit the disklabel so that partition 'a' has fstype RAID,
disklabel -r -e -I wd1
like,
disk: sys0
label: SYS0
a: 234441585 63 RAID
d: 234441648 0 unused 0 0 # (Cyl. 0 - 232580)
or another example,
disk: sys1
label: SYS1
a: 390721905 63 RAID
d: 390721968 0 unused 0 0 # (Cyl. 0 - 387620)
Prepare the RAID-1 flat configuration file,
vi /var/tmp/raidsys.conf
like,
START array
1 2 0
START disks
absent
/dev/wd1a
START layout
128 1 1 1
START queue
fifo 100
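For reference, here is the same file again with the fields commented (a restatement -- RAIDframe configuration files accept '#' comments); the layout line gives the sectors per stripe unit, stripe units per parity unit, stripe units per reconstruction unit, and the RAID level,
START array
# numRow numCol numSpare
1 2 0
START disks
# wd0a is listed as absent for now, it joins the array later
absent
/dev/wd1a
START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
128 1 1 1
START queue
# queue type and number of outstanding requests
fifo 100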
and configure the raid device, then assign it a serial number,
raidctl -v -C /var/tmp/raidsys.conf raid0
#raidctl -v -I `date +%Y-%m-%d-%s` raid0
raidctl -v -I `date +%Y-%m-%d` raid0
then initialize the parity; this should be quite fast here, since there is only one disk,
raidctl -v -i raid0
CHECK THE SYSTEM LOGS WHILE DOING THE RAID INITIALIZATION!
Setting up the system on RAID
Edit the disklabel for the newly created raid device (partition "a" needs to be at offset 0 so the BIOS can see it and boot from it),
disklabel -r -e -I raid0
say you want 1024MB of swap, e.g. matching the amount of RAM (1024 x 1024 x 1024 / 512 = 2097152 sectors); 234441472 - 2097152 = 232344320 sectors are left for the root partition,
a: 232344320 0 4.2BSD 0 0
b: 2097152 232344320 swap
d: 234441472 0 unused 0 0 # (Cyl. 0 - 228946*)
or another example (390721792 - 2097152 = 388624640 sectors),
a: 388624640 0 4.2BSD 0 0
b: 2097152 388624640 swap
d: 390721792 0 unused 0 0 # (Cyl. 0 - 381564*)
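The arithmetic generalizes easily; a minimal sh sketch, assuming a 1024MB swap and the total raid0 size as reported by its own 'd' partition,
raidsize=234441472                            # total sectors of raid0 ('d' partition)
swapsect=$((1024 * 1024 * 1024 / 512))        # 1024MB of swap in 512-byte sectors = 2097152
echo "a: $((raidsize - swapsect)) 0"          # root partition: size and offset
echo "b: $swapsect $((raidsize - swapsect))"  # swap: size and offset, right after root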
Note. The 'c' partition doesn't seem to be mandatory (same in the official -- longer -- RAIDframe guide).
Initialize the filesystem as FFSv2, mount, and entirely copy the currently running system to the raid device,
newfs -O 2 /dev/rraid0a
#fsck -fy /dev/rraid0a
mount /dev/raid0a /mnt/
cd /; pax -v -X -rw -pe . /mnt/
note. The copy takes a while.
edit the copied fstab to fix the root device paths,
cd /mnt/etc/
mv fstab fstab.noraid
sed 's/wd0/raid0/g' fstab.noraid > fstab
ls -l fstab*
fstab should now have,
/dev/raid0a / ffs rw,log 1 1
/dev/raid0b none swap sw,dp 0 0
Make sure swapoff is already enabled on the copied system (it disables swap during shutdown, to avoid parity errors on the RAID device),
grep swapoff /mnt/etc/defaults/rc.conf # should be there already!
#grep swapoff /mnt/etc/rc.conf
Install the boot loader onto that raid disk (the first 63 sectors have been kept, remember?), so that it is bootable just as if it weren't a raid disk (the 'a' partition on raid0 starts at offset 0, remember?),
/usr/sbin/installboot -o timeout=10 -v /dev/rwd1a /usr/mdec/bootxx_ffsv2
mv /boot.cfg /boot.cfg.bkp
mv /mnt/boot.cfg /mnt/boot.cfg.bkp
Note. It is best to temporarily use a distinct timeout for each raid disk so you can quickly tell at boot time which disk you are booting from! Here 10 for the second and, for now, only raid disk.
Note. Yes, bootxx_ffsv2, since the filesystem on raid0a was initialized as FFSv2 with newfs.
Note. Also remove /boot.cfg otherwise the timeout in there takes precedence.
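Alternatively, instead of removing /boot.cfg you could keep it and simply make its timeout match; a hypothetical /mnt/boot.cfg sketch, assuming the standard boot.cfg(5) directives,
banner=Booting the raid disk (wd1)
timeout=10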
Enable RAID auto-configuration and reboot,
raidctl -v -A root raid0
tail -2 /var/log/messages
raidctl -s raid0
cd /
sync
shutdown -r now
"The first boot with RAID"
Go into your BIOS or BBS boot menu and explicitly choose the second disk (wd1) to boot from.
OK, make sure the system is actually running from raid0 (wd1) and not from wd0,
mount
swapctl -l
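Another quick check (assuming the kern.root_device sysctl node) is to ask the kernel which device it booted from,
sysctl kern.root_device # should report raid0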
Now copy the MBR and DOS partition table from wd1 to wd0,
dd if=/dev/rwd1d of=/dev/rwd0d bs=8k count=1
and verify that the DOS partitioning layouts are exactly the same on both RAID components,
fdisk /dev/rwd1d > fdisk.wd1
fdisk /dev/rwd0d > fdisk.wd0
diff -bu fdisk.wd1 fdisk.wd0
Do the same for the BSD disk labels and partitions,
disklabel -r wd1a > disklabel.wd1a
disklabel -R -r wd0a disklabel.wd1a
disklabel -r -e -I wd0a
and adjust the disk name in the editor, e.g.,
disk: sys0
and check,
disklabel -r wd0a > disklabel.wd0a
diff -bu disklabel.wd1a disklabel.wd0a
Finally add wd0 (at first as a hot spare) to the RAID-1 array,
raidctl -v -a /dev/wd0a raid0
note. The "truncating spare disk" warning in the system logs is fine.
See? You should now have a spare. Check with,
raidctl -s raid0
then fail the absent component so that it gets reconstructed onto the spare and the disk joins the array (this takes a while, come back a few days later!),
raidctl -v -F component0 raid0
note. This should say in the system logs that it is initiating a reconstruction on the available spare disk.
you can interrupt the display,
^C
and get back to it to check the reconstruction progress,
raidctl -S raid0
or see how fast the drives are working,
iostat 5
You can (continue to) use the system and the available space for storage, but performance won't be optimal until the reconstruction has finished.
A few days later -- Ready to go
Make sure every component shows as 'optimal',
raidctl -v -s raid0
Note. If wd0a is still referenced as used_spare, just reboot soon enough and it will then show as a regular component.
The bootloader on wd0 should already be fine thanks to the earlier 'dd', but it's best to differentiate the disks. Install the bootloader on wd0 again with a specific timeout so you can identify it at boot time,
/usr/sbin/installboot -o timeout=5 -v /dev/rwd0a /usr/mdec/bootxx_ffsv2
ls -l /boot*
note. Also remove /boot.cfg otherwise the timeout in there takes precedence.
and reboot,
cd /
sync
shutdown -r now
BIOS configuration
Tune the boot sequence to make sure the machine is able to boot from both disks, the second one as well as the first.
*** Part B -- Setting up a RAID-1 or RAID-5 array for storage ***
Assuming wd2 and wd3 for this array, and creating an array referenced "raid1" since we already have an array named "raid0" on the system.
Preparing the disks
Erase the first few sectors of the targeted RAID disks (the MBR and partition tables get lost),
#dd if=/dev/zero of=/dev/rwd2d bs=8k count=1
#dd if=/dev/zero of=/dev/rwd3d bs=8k count=1
dd if=/dev/zero of=/dev/rwd2d bs=1024k count=1
dd if=/dev/zero of=/dev/rwd3d bs=1024k count=1
I don't need a DOS partition table because I won't boot from this array; fdisk is only run here to check that none is left,
fdisk wd2
fdisk wd3
disklabel -r -e -I wd2
disklabel -r -e -I wd3
change the first BSD partition (a)'s fstype from '4.2BSD' to 'RAID'. Using the whole size of the disk is fine since we are not booting from it,
disk: data0
label: DATA0
a: 2930277168 0 RAID
d: 2930277168 0 unused 0 0 # (Cyl. 0 - 2907020)
disk: data1
label: DATA1
a: 2930277168 0 RAID
d: 2930277168 0 unused 0 0 # (Cyl. 0 - 2907020)
Initializing the RAID Device
Create the RAID configuration,
vi /var/tmp/raiddata.conf
For a RAID-1 array, e.g. with one row, two columns (two disks) and no spare disk,
START array
1 2 0
START disks
/dev/wd2a
/dev/wd3a
START layout
128 1 1 1
START queue
fifo 100
For a RAID-5 array, e.g. with one row, three columns (three disks) and no spare disk,
START array
1 3 0
START disks
/dev/wd2a
/dev/wd3a
/dev/wd4a
START layout
128 1 1 5
START queue
fifo 100
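As a quick sanity check on the layout line: 128 sectors per stripe unit is 64KB, and with three columns a RAID-5 full stripe carries two data units. A minimal sh sketch of that arithmetic, using the numbers from the example above,
sectpersu=128
cols=3
echo "stripe unit: $((sectpersu * 512 / 1024)) KB"                               # 64 KB
echo "data per full RAID-5 stripe: $(((cols - 1) * sectpersu * 512 / 1024)) KB"  # 128 KB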
Configure the RAID volume (uppercase -C forces the configuration to take place),
raidctl -v -C /var/tmp/raiddata.conf raid1
Assign a serial number (here UNIX time) to identify the RAID volume,
raidctl -v -I `date +%s` raid1
Initialize the RAID volume (takes a while... e.g. several hours for a 1.5TB RAID-1 array),
time raidctl -v -i raid1
Note. the process is already running in the background; you can get back to the prompt if you want,
^C
and bring the progress display back up later by typing,
raidctl -S raid1
or see how fast the drives are working,
iostat 5
CHECK THE SYSTEM LOGS WHILE DOING THE RAID INITIALIZATION!
You can continue while the raid array is being initialized.
Make sure the RAID volume is able to configure itself without needing the raiddata.conf configuration file,
raidctl -A yes raid1
Note. for a bootable root device we would use '-A root', but that is out of scope for this part of the guide.
Ready to go
Even though the parity check hasn't finished, you can already proceed and use your RAID array, and you can even reboot the system (as long as the array gets configured at boot, here thanks to auto-configuration). The RAID device will just be a lot slower while the parity is being written, so it's best to let it finish.
If your RAID volume is larger than 2TB you should get a kernel log warning similar to the following during RAID initialization and at boot time,
WARNING: raid1: total sector size in disklabel (1565586688) != the size of raid (5860553984)
For a <2TB volume just proceed with disklabel and newfs,
dd if=/dev/zero of=/dev/rraid1d bs=1024k count=1
disklabel raid1
newfs -O 2 -b 64k /dev/rraid1a
(or you can proceed with GPT and wedges just as if it were a >2TB disk)
For a >2TB volume proceed with GPT and wedges,
dd if=/dev/zero of=/dev/rraid1d bs=1024k count=1
gpt create raid1
gpt add raid1
gpt show raid1
dkctl raid1 addwedge raid1wedge 34 2930276925 ffs
dkctl raid1 listwedges
and proceed with newfs e.g.,
newfs -O 2 -b 64k /dev/rdk0
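To get the new filesystem mounted at boot, add it to /etc/fstab; hypothetical entries assuming a /data mount point (use the raid1a partition for the <2TB case, or the dk0 wedge created above for the >2TB case),
/dev/raid1a /data ffs rw,log 1 2
#/dev/dk0 /data ffs rw,log 1 2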
*** Part C -- Usage and maintenance ***
Monthly maintenance
Once a month, or through monitoring scripts, check,
atactl wd1 smart status
...
dkctl wd1 getcache (write caching should be disabled if there is no UPS)
...
raidctl -p raid1
raidctl -s raid1
raidctl -S raid1
fix the parity if it's not clean,
raidctl -P raid1
Note. that command is executed at boot every time (/etc/rc.d/raidframeparity).
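Those checks are easy to automate; a minimal monitoring sketch (hypothetical script -- the strings to match may need adjusting to whatever raidctl -s prints on your system), to be run e.g. from a cron job,
#!/bin/sh
# mail root if raid1 has a failed component or dirty parity
status=`raidctl -s raid1 2>&1`
if echo "$status" | grep -qi -e failed -e dirty; then
        echo "$status" | mail -s "RAIDframe warning on `hostname`" root
fi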
Optionally, use smartmontools instead of atactl,
echo $PKG_PATH
pkg_add smartmontools
cp /usr/pkg/share/examples/rc.d/smartd /etc/rc.d/
cd /etc/
echo smartd=yes >> rc.conf
rc.d/smartd start
smartctl -l selftest /dev/rwd0d
smartctl -a /dev/rwd1d
smartctl -A /dev/rwd1d
==> look for Current_Pending_Sector
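smartd needs a configuration file to know which drives to monitor; a hypothetical /usr/pkg/etc/smartd.conf sketch (paths and test schedule are assumptions, see smartd.conf(5)),
# monitor both drives, mail root on trouble, short self-test on Sundays between 2 and 3 am
/dev/rwd0d -a -m root -s S/../../7/02
/dev/rwd1d -a -m root -s S/../../7/02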
Determine the disks' identities before swapping them:
atactl wd0 identify | grep -i serial
atactl wd1 identify | grep -i serial
atactl wd2 identify | grep -i serial
atactl wd3 identify | grep -i serial
atactl wd4 identify | grep -i serial
For the record, here is a brief summary of the basic maintenance commands,
raidctl -a /dev/wdXx raidX   add a hot spare disk
raidctl -r /dev/wdXx raidX   remove a hot spare disk
raidctl -g /dev/wdXx raidX   print the component label
raidctl -G raidX             print the current RAID configuration
raidctl -f /dev/wdXx raidX   fail the component without reconstruction
raidctl -F /dev/wdXx raidX   fail the component and initiate a reconstruction onto the hot spare, if available
raidctl -R /dev/wdXx raidX   fail the component and reconstruct onto it in place (after it has been replaced)
raidctl -B raidX             copy the reconstructed data back from the spare disk to the original disk
Replacing a failing disk on a non-booting array
If you need to replace a drive (if raidctl -s reports a failed component), verify the component's identity,
#raidctl -G raid1
#raidctl -g /dev/wd3a raid1
dmesg | grep ^wd3
atactl wd3 identify | grep -i serial
and replace the drive, matching its serial number.
Now that the new drive is in place, prepare it for RAIDframe,
dd if=/dev/zero of=/dev/rwd3d bs=1024k count=1
disklabel -r -e -I wd3
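The new disklabel should end up matching the surviving component's; following the earlier DATA1 example that would be,
a: 2930277168 0 RAID
d: 2930277168 0 unused 0 0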
While the RAID array stays online
Note. the -R (rebuild-in-place) method produces this error,
stdout -- raidctl: ioctl (RAIDFRAME_REBUILD_IN_PLACE) failed: Invalid argument
console -- raid1: Device already configured!
==> Proceeding with the hot spare method
Then add it to the raid array as a spare drive,
raidctl -a /dev/wd3a raid1
then reconstruct the disk as part of the array,
raidctl -s raid1
raidctl -F component0 raid1
check with,
raidctl -s raid1
raidctl -S raid1
when it's finished, enable auto-configuration on the array,
raidctl -A yes raid1
and make sure there are no leftover /etc/raid*.conf files.
At the next reboot the array will show up with both components optimal and no spares. In the meantime it's also fine to run as it is.
While the RAID array is offline
Update the RAIDframe array configuration,
raidctl -G raid1
cat /var/tmp/raiddata.conf
START array
1 2 0
START disks
/dev/wd2a
/dev/wd3a
START layout
128 1 1 1
START queue
fifo 100
raidctl -c /var/tmp/raiddata.conf raid1
and reconstruct the disk as part of the array,
raidctl -R /dev/wd3a raid1
check with,
raidctl -s raid1
raidctl -S raid1
when it's finished, enable auto-configuration on the array,
raidctl -A yes raid1
Adding a hot spare to the array
Add the new drive as a hot spare,
raidctl -v -a /dev/wd4a raid1
raidctl -s raid1
fail the component and force the use of the spare disk,
raidctl -F component0 raid1
raidctl -s raid1
watch the reconstruction progress,
raidctl -S raid1
or see how fast the drives are working,
iostat 5
Turn a used spare disk into an array component
Make sure autoconfig is enabled,
raidctl -g /dev/wd2a raid1
raidctl -A yes raid1
unconfigure the raidframe device for a moment (it doesn't harm the array),
umount /mount/point/
raidctl -u raid1
reconfigure the array as you wish (the wd2 disk is already reconstructed since it was used as a spare),
#raidctl -G raid1
cat /var/tmp/raiddata.conf
START array
1 2 0
START disks
/dev/wd2a
/dev/wd3a
START layout
128 1 1 1
START queue
fifo 100
raidctl -c /var/tmp/raiddata.conf raid1
shutdown -r now
once rebooted, check that everything is fine (no spare left, just wd2a and wd3a are in optimal state),
raidctl -s raid1
Recover a failing RAID-1 booting array
Recover a damaged single RAID-1 component (move the data from wd1 to wd0 when raidctl -F/-R no longer works because of uncorrectable hardware data errors). In other words, the array was already non-optimal as it had only one disk, and on top of that, this single drive starts showing serious hardware errors, so you can't even reconstruct onto a spare drive.
In brief:
- create the raid1 array and partition 'a',
- copy from raid0a to raid1a with pax,
- restart on raid1a,
- erase the raid0 array and change the disk.
Remove the spare disk from the existing array, since you need it to build the new raid array on it,
raidctl -v -r /dev/wd0a raid0
and check,
raidctl -s raid0
Proceed,
dd if=/dev/zero of=/dev/rwd0d bs=8k count=1
fdisk -0ua /dev/rwd0d # ==> active and only partition; accept rewriting the MBR if asked during the process
disklabel -r -e -I wd0 # ==> partition a becomes RAID
vi /var/tmp/raid1.conf
like,
START array
1 2 0
START disks
/dev/wd0a
absent
START layout
128 1 1 1
START queue
fifo 100
then,
raidctl -v -C /var/tmp/raid1.conf raid1
raidctl -v -I `date +%Y-%m-%d-%s` raid1
raidctl -v -i raid1
note. the parity initialization (-i) is quite fast here, since there is only one disk.
note. the "Error re-writing parity!" error in the logs is just normal, since we got an absent device for RAID-1.
and check,
raidctl -s raid1
Now,
disklabel -r -e -I raid1
newfs -O 2 /dev/rraid1a
mount /dev/raid1a /mnt/
cd /; pax -v -X -rw -pe . /mnt/
vi /mnt/etc/fstab
:%s/raid0/raid1/g
/usr/sbin/installboot -o timeout=10 -v /dev/rwd0a /usr/mdec/bootxx_ffsv2
mv /boot.cfg /boot.cfg.bkp
raidctl -v -A root raid1
raidctl -v -s raid1
cd /
sync
shutdown -r now
note. Also remove /boot.cfg otherwise the timeout in there takes precedence.
on the next boot, disable the former, now broken, raid device,
raidctl -A no raid0
raidctl -v -u raid0
You can now replace the broken disk and proceed with the "The first boot with RAID" section earlier in this guide.
TODO
- what about "dkctl wd1 setcache r" after the next reboot -- is it still in effect?
Troubleshooting
If you ever need to erase only the component label (untested; note that dd's 'skip' applies to the input, so writing at an offset on the output device takes 'seek' instead),
dd if=/dev/zero of=/dev/rwdXa seek=16 bs=1k count=1
References
16.2. Setup RAIDframe Support: http://www.netbsd.org/docs/guide/en/chap-rf.html#chap-rf-initsetup
22.2. Deleting the disklabel: https://www.netbsd.org/docs/guide/en/chap-misc.html#chap-misc-delete-disklabel
Appendix B. Installing without sysinst: http://www.nibel.net/nbsdeng/ap-inst.html
Chapter 16. NetBSD RAIDframe: http://www.netbsd.org/docs/guide/en/chap-rf.html
Configuring RAID on NetBSD: http://www.thorburn.se/henrik/netbsd/raidhowto.txt
Hitachi 1TB HDD's, NetBSD 6.0.1 and RAID1 - soft errors and clicking noises!: https://mail-index.netbsd.org/netbsd-users/2013/02/07/msg012431.html
How To Fix / Repair Bad Blocks In Linux: http://linoxide.com/linux-how-to/how-to-fix-repair-bad-blocks-in-linux/
How to make backups using NetBSD's RAIDframe: http://www.schmonz.com/2004/07/23/how-to-make-backups-using-netbsds-raidframe/
NetBSD and RAIDframe: http://www.cs.usask.ca/staff/oster/raid.html
NetBSD and RAIDframe History: http://www.cs.usask.ca/staff/oster/raid_project_history.html
Setting up an 8TB NetBSD file server: http://abs0d.blogspot.fr/2011/08/setting-up-8tb-netbsd-file-server.html
Setting up raidframe(4) on NetBSD: http://wiki.netbsd.org/set-up_raidframe/
The adventure of building a 4TB raid 5 under NetBSD 5.1: http://mail-index.netbsd.org/netbsd-users/2011/09/02/msg008979.html