Getting older, not necessarily wiser!
In this article I will be setting up a RAID array in Debian from the terminal. This has been a request I have received on multiple occasions, so I have bumped it up on my project list.
Before going forward, there are a few items I want to make sure everyone is aware of:
For demonstration purposes, I will be performing this tutorial in a virtual machine. This makes it easier, not having to set up hardware. However, the process will be the same on actual hardware.
Be aware of how long it takes to set up a RAID array. This is something that gets overlooked in some RAID tutorials. Once you do all the leg work, it can take a lot of time to actually build the array, especially with larger storage drives. For example, I once set one up with four 1 TB storage drives, and it took about 40 minutes. A friend set up a RAID array of several 4 TB drives that took close to three hours. So once you start the process, go do something else while your computer chugs along.
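While an array is building, the kernel reports progress through /proc/mdstat, including a completion percentage and an estimated finish time. A minimal check, safe to run at any point:

```shell
# Show the current RAID status, including rebuild/resync progress.
# (/proc/mdstat only exists once the md driver has been loaded.)
cat /proc/mdstat 2>/dev/null || echo "md driver not loaded"
```

To refresh the view automatically, wrap it in watch, for example watch -n 60 cat /proc/mdstat.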
A RAID array is initially just a big empty disk. Once the RAID array is created, you still have to format it.
Fault-tolerant RAID arrays consume storage. Be aware that fault tolerance in a RAID array costs one or more drives' worth of storage to achieve. So remember to subtract that from your capacity estimates.
The mdadm program is the administrative tool we will use to create, manage, and monitor RAID arrays. It relies on the md driver, which does the actual disk manipulation. The mdadm tool supports RAID0 (striping), RAID1 (mirroring), RAID4 (striped, non-distributed parity), RAID5 (striped, distributed parity), RAID6 (striped, extended distributed parity), and RAID10 (striped mirrors).
Since we are working with hardware, mdadm requires root access. This can be achieved by logging in as root, using su to switch to the root account, or using a user with sudo access. Take your pick, as I am not going to argue the security issues or benefits of the various methods.
There are a couple of ways to determine if mdadm is already installed. The easiest is just to run it with the -V option. You will either get “command not found” if it is not installed, or the version number if it is.
$ sudo mdadm -V
Another way to check, which does not involve elevated privileges, is with apt and grep. This will list the program and whether it is installed or not.
$ apt list | grep mdadm
If the program is not installed, then use the following commands to install it. Please remember to update and upgrade first.
$ sudo apt update
$ sudo apt upgrade
$ sudo apt install mdadm
For this tutorial I am going to assume no previous arrays are present. Note that removing existing RAID arrays via the command line can be a bit tedious, so I plan to discuss this in a followup post.
I am also going to assume you have the needed drives already attached to your system: for RAID0 and RAID1, at least 2; for RAID4 and RAID5, at least 3; for RAID6 and RAID10, at least 4.
I am also assuming you are aware that you will lose at least one drive's worth of capacity in any fault-tolerant RAID configuration.
The only drive specification that needs to match is the type. For example, all drives must be SATA, or they must all be SAS, or they must all be IDE (if you have a very old system). For all other parameters, your RAID array's performance and capacity will be determined by the worst drive, so to speak. For example, if you have one 500 GB drive and two 1 TB drives, a RAID array will use all drives as if they were 500 GB, losing half the available space on the 1 TB drives. This is why it is recommended to use similarly sized drives.
Assuming you have the hardware setup properly and you have the mdadm application loaded, there are a number of steps that one needs to follow to successfully setup a RAID array from the command line on Debian.
In this example we will assume we are building a new file server. In addition to our boot drive, we have four similar storage drives to incorporate into a RAID5 array. Earlier we mentioned that RAID5 is striped with distributed parity.
What this means is that our RAID array should be able to withstand losing a single storage drive. If we replace that drive, the array can recreate the data that was on it from the data and parity information stored on the other drives. However, we will lose storage equivalent to one drive to achieve this.
If we are using four 32 GB drives, then our final RAID5 array storage should be (4-1)*32 GB, or about 96 GB.
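The same arithmetic as a quick shell sketch, using the values from our example:

```shell
# RAID5 usable capacity = (number of drives - 1) * size of the smallest drive.
drives=4     # drives in the array
size_gb=32   # size of the smallest drive, in GB
echo "usable capacity: $(( (drives - 1) * size_gb )) GB"   # prints "usable capacity: 96 GB"
```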
Technically we can call our RAID array anything we want. However historically RAID arrays have been listed in the /dev/ directory as md0, md1, md2, etc.
As far as a mount point, many people use /mnt/ and create a directory within it to mount the array to. I tend to use /mnt/ for other things, so I find that incompatible with my workflow. Instead I use /srv/ to attach my RAID arrays. I also tend to create mount points that reflect the usage. In this case I will use /srv/rd5_0 as my mount point. When I see this I know it is the first (0) RAID5 (rd5) array.
But you can do whatever you want. Just make sure to create the mount point before going any further. In my case:
$ sudo mkdir /srv/rd5_0
Next, check for any existing RAID arrays. This is just to make sure we will not be duplicating work or interfering with an already existing array. In our case we specified a new system, so we can assume there are none.
However if one wanted to find an existing RAID array, that information is stored in /proc/mdstat.
$ cat /proc/mdstat
To be used in a new RAID array, the drives cannot be mounted, nor can they have any partitions on them. Use the lsblk command to check this.
The drives should show no partitions and no mount points. If this is not the case, it will be necessary to unmount them and remove the partitions.
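For example (device names and sizes will differ on your system):

```shell
# List block devices with size, type, and mount point. A drive that is
# ready for the array appears as TYPE "disk" with no partition children
# and an empty MOUNTPOINT column.
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
```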
You can use the umount command to unmount partitions. For example:
$ sudo umount /mnt/temp/
You may also want to check the /etc/fstab and /etc/mdadm/mdadm.conf files. These will show whether the drives are being auto-mounted at boot, along with any references to existing RAID arrays. If the files are going to be edited directly, then I suggest making backup copies first.
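A quick, read-only way to scan both files, assuming the drives are sdb through sde as in our example:

```shell
# Look for references to the drives or to md devices; no output means
# there is nothing to clean up. (The || true keeps the commands from
# signalling an error when grep simply finds no matches.)
grep -E 'sd[b-e]|/dev/md' /etc/fstab 2>/dev/null || true
grep -i '^ARRAY' /etc/mdadm/mdadm.conf 2>/dev/null || true
```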
If there are existing partitions, there are any number of programs that can be used to remove them; fdisk and cfdisk come to mind. But use whatever you feel comfortable with.
If the drives have been used as part of a RAID array before, you will want to zero out their superblocks. For example:
$ sudo mdadm --zero-superblock /dev/sda
Note that if you are new to RAID arrays, it is unlikely you will need to do most of the above. This mostly applies to storage drives of unknown origin, or drives used in other systems. The important thing to remember is that, before being used in a RAID array, a drive should be cleared of any previous partitions. If this is not the case, you will most likely get an error when actually creating the array.
In our example we have four prepared drives, sdb through sde.
$ sudo mdadm --create /dev/md0 --level=raid5 --raid-devices=4 /dev/sd[b-e]
In this example the --create option specifies the device to create within the /dev/ directory, --level is the type of RAID being created, and --raid-devices is the number of drives in the array, followed by the list of drives to use. Note the shell shorthand being used here: /dev/sd[b-e] will expand to the names of all four drives.
To confirm creation, look for the device /dev/md0, and check that /dev/sdb through /dev/sde each now show an md0 entry beneath them.
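On our example system, lsblk makes this easy to verify:

```shell
# Each member drive (sdb through sde) should now show an md0 child,
# and /dev/md0 itself should appear with TYPE "raid5".
lsblk
```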
When we created the RAID array, the layout should have been saved to /proc/mdstat. If you check the contents, you should see the RAID array listed as active on md0 also.
$ cat /proc/mdstat
Now that the RAID array has been created, we can format it. For this example we will use the ext4 file system.
$ sudo mkfs.ext4 /dev/md0
Making sure our mount point is created, we can use the mount command to attach the RAID array to our directory tree. Again, from our example:
$ sudo mount /dev/md0 /srv/rd5_0
To confirm the mounting, use the df program with the -h (human-readable) option. We should see /srv/rd5_0 with its size listed as about 94G, a little under the 96 GB raw capacity due to filesystem overhead. The RAID array should be fully usable now.
$ df -h
Now we have to make sure the RAID array survives and starts up at boot. We need to create an entry in the /etc/fstab file. As always, create a backup first. You can edit the file by hand or use the below bit of code to add the entry. The fields in the single-quoted entry can be separated by spaces or tabs. Use the cat command to confirm the entry is correctly written.
$ echo '/dev/md0 /srv/rd5_0 ext4 defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab
$ cat /etc/fstab
In this case we need to use the tee command because we need sudo permissions to write to a file in /etc.
Next we need to update the /etc/mdadm/mdadm.conf file. We will use mdadm to generate the required information, and the same sudo tee trick we used earlier. The --scan option will locate the RAID array, and the --detail option will pull the relevant data.
$ sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
Finally we need to update the initramfs system with the new information. The -u option updates the existing initramfs.
$ sudo update-initramfs -u
At this point the RAID array will survive a reboot and is ready for use.
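For an explicit health check after rebooting, mdadm can report the array's state directly. On our example system:

```shell
# Detailed health report: the state should be "clean" and all four
# member drives listed as "active sync". (Requires root; the fallback
# message just covers systems where /dev/md0 does not exist.)
sudo mdadm --detail /dev/md0 || echo "/dev/md0 not present on this system"

# Confirm the mount came back after the reboot.
df -h | grep rd5_0 || true
```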
In this post we walked through creating a RAID array from the terminal using the mdadm program. While somewhat more complex than using most GUI interfaces, it is good to be familiar with the command line tools. As good as some GUI interfaces are, they cannot cover every single option available through the command line.