My New Year resolutions finally started to come good recently, when I was allowed to buy 3 x 2TB drives for my desktop. I expected that fitting them into the PC case itself would be a pain (I never seem to have enough SATA power connectors, though SATA data cables are practically coming out of my ears! How can there possibly be such a mis-match??!), but I hadn’t expected it to be quite such a trial turning them into a RAID5 system that my Linux distro could recognise and use. But it was… or, at least, it felt like it at the time!
First, it’s “fake raid”. I’m too cheap to buy a real hardware RAID card -and besides, I’d never get it past the Household Budget Watch Committee of One. So, it’s there in the motherboard’s BIOS… a quick F1 on boot-up, a button-press here, and F10 there… bingo, I have 4TB of usable storage and I can afford a hard disk failure. Nice.
Now, boot up Linux and check with the Gparted tool: bummer. There are /dev/sdb, /dev/sdc and /dev/sdd, each identified separately as a 2TB drive (well, OK… 1.82TiB, but that’s inflation for you). But they’re all separate from each other, and there’s no apparent understanding that, actually, all three are doing teamwork now. Luckily for me, the problem is in Gparted, not my hard disks (or even my fake raid setup): it simply doesn’t “do” fake raid.
The good news, however, is that the tools which *can* do fake raid are available -and are, in fact, probably already installed on your distro. The key one is dmraid. If you just type sudo dmraid (or become root and type dmraid), you’ll know soon enough if it’s installed: you should get an error message complaining that no arguments or options have been given. If instead you are told “command not found”, then it’s not installed and you’ll have to install it using your distro’s package manager (and, if I were doing it, reboot afterwards). The other tool you’ll need to set things up properly is parted. That’s not Gparted, the graphical tool which doesn’t understand fake raid, but its command line cousin which does.
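If you would rather check for both tools in one go, a small script can do the looking. This is a minimal sketch: the function name check_tools is made up for illustration, and it only searches the current PATH. Because dmraid and parted often live in /sbin or /usr/sbin, it may report "not found" for an ordinary user on some distros even when they are installed, so run it as root if in doubt.

```shell
#!/bin/sh
# Report whether each named tool can be found on the PATH.
# Nothing is installed or changed; this only looks.
# (check_tools is a hypothetical helper name, not part of any package.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: not found (install it with your package manager)"
            missing=1
        fi
    done
    return "$missing"
}

# The two tools this walkthrough relies on.
check_tools dmraid parted || echo "At least one tool is missing."
```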
So: assuming that dmraid’s been installed and is running (it runs by default in all later editions of Ubuntu and its derivatives, for example), you’ll first need to know under what name your fake raid device has been detected. If you do ls /dev/mapper/*, you should see a weird device name listed there. Mine happened to be isw_ifdbedffj_safedata, and I recognised this to be my fake raid because the name “safedata” was one I’d assigned in the BIOS setup screen when creating the array in the first place.
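The same listing can be tidied up a little by skipping the "control" entry, which belongs to device-mapper itself and is never an array. The sketch below assumes nothing beyond a directory of entries; list_mapped_devices is a hypothetical name, and the optional argument is only there so the function can be tried against a scratch directory. Bear in mind that LVM volumes and encrypted devices also show up under /dev/mapper, so not everything it prints is necessarily fake raid.

```shell
#!/bin/sh
# Print every entry in a device-mapper directory except the "control" node.
# The directory defaults to /dev/mapper; passing another directory lets
# the function be tried out safely elsewhere.
# (list_mapped_devices is a hypothetical helper name.)
list_mapped_devices() (
    dir=${1:-/dev/mapper}
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue      # directory is empty or missing
        name=${entry##*/}                # strip the leading path
        if [ "$name" != "control" ]; then
            echo "$name"
        fi
    done
)

list_mapped_devices
```

On my setup, that would be expected to print the isw_ifdbedffj_safedata name mentioned above, plus an entry for each partition once some exist.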
Now that you know the device name, you can partition it. In the old days of peanut-sized hard disks, you’d have done something like fdisk /dev/sda to begin the process of partitioning the sda hard disk. Try that now, however, and you’ll be in trouble because (a) fdisk doesn’t like working with large hard drives and (b) /dev/sda isn’t the right device name! Instead, you work with the parted tool to set up partitions on (in my case) /dev/mapper/isw_ifdbedffj_safedata. It trips less easily off the fingers and keyboard, that’s for sure! But at least it will work. Here’s what I did:
sudo parted /dev/mapper/isw_ifdbedffj_safedata
(parted) mklabel gpt
Warning: The existing disk label on /dev/mapper/isw_ifdbedffj_safedata will be destroyed
and all data on this disk will be lost. Do you want to continue?
(parted) mkpart primary ext4 4 -1
(parted) align-check optimal 1
(parted) name 1 safedata
The mklabel gpt command there causes this large volume to be created as a GUID Partition Table (GPT) drive, as opposed to a more-usual Master Boot Record one, which can’t address more than about 2TiB when sectors are 512 bytes long. This is something we’re probably going to have to get used to now that 3TB disks are available for quite reasonable sums!
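The Master Boot Record ceiling is easy to work out for yourself. The sketch below assumes the usual explanation, namely that the MBR stores sector numbers in 32-bit fields, so sizes top out at about 2^32 sectors; the function names mbr_limit_bytes and needs_gpt are invented for this example.

```shell
#!/bin/sh
# MBR partition entries hold sector counts in 32-bit fields, so the
# ceiling is roughly 2^32 sectors multiplied by the sector size.
# (Both function names are hypothetical helpers.)
mbr_limit_bytes() {
    sector_size=${1:-512}
    echo $(( 4294967296 * sector_size ))
}

# Succeeds (exit status 0) when a volume of the given size in bytes
# is too big for an MBR label and so needs GPT.
needs_gpt() {
    [ "$1" -gt "$(mbr_limit_bytes "${2:-512}")" ]
}

mbr_limit_bytes 512          # prints 2199023255552, i.e. 2 TiB
needs_gpt 4000000000000 && echo "a 4TB array needs GPT"
```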
The other interesting command in that lot of gibberish was this one: mkpart primary ext4 4 -1. From the name, you can probably guess this is the command that is actually making or creating the partition. I wanted a single volume of 4TB in size, so I’m creating a single primary partition which will eventually use ext4 as its file system. The tricky bit is those last two numbers. They tell parted where the new partition should start and stop, expressed as offsets from the start of the disk (in megabytes, unless you specify another unit). A negative number is counted back from the end of the disk, so ‘-1’ effectively means ‘keep going until you run out of disk platter!’. My code, for example, says “start at the 4MB mark and continue until the end of the disk”. Which probably prompts the next obvious question: why start at 4MB? Why not at 0?
Well, here’s the message I got when I did start at 0:
(parted) mkpart primary ext4 0 -1
Warning: The resulting partition is not properly aligned for best performance.
I’m afraid we’re talking about that hoary old chestnut, partition boundary alignment. Your raid array has a stripe size; the file system is built out of clusters; if the partition boundaries aren’t aligned right, then a cluster can straddle two stripes, with the effect that what ought to have been one I/O operation becomes two. Windows suffers from the same thing, incidentally, and there’s even an article available on the issue (that probably explains it better than I just did!). Long story cut short, therefore: by luck (rather than skill and profound insight), I found that sacrificing the first 4MB of my hard disk allowed my partition boundaries to align correctly (and thus give me a substantial performance boost for nearly nothing). The number you’d have to sacrifice to achieve the same thing will depend entirely on your stripe size, cluster size and (probably) the wind direction that day… so experiment. The align-check command you see me do simply gets parted to confirm that the newly-created partition really is properly aligned.
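The arithmetic behind "aligned" can take some of the guesswork out of the experimenting. This is a simplified sketch, not what parted itself does: parted's optimal check relies on what the kernel reports for the device, whereas here the stripe size is simply assumed to be 128 KiB (look up the real figure in your BIOS raid setup). The function names is_aligned and next_aligned are made up for the example.

```shell
#!/bin/sh
# Succeeds when an offset (in bytes) falls exactly on a stripe boundary.
# (is_aligned and next_aligned are hypothetical helper names.)
is_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}

# Round an offset (in bytes) up to the next multiple of the stripe size.
next_aligned() {
    echo $(( ( ($1 + $2 - 1) / $2 ) * $2 ))
}

stripe=$(( 128 * 1024 ))          # ASSUMED 128 KiB stripe; yours may differ
start=$(( 4 * 1024 * 1024 ))      # a 4 MiB starting offset

is_aligned "$start" "$stripe" && echo "4 MiB is a multiple of the stripe size"
next_aligned 1000000 "$stripe"    # prints 1048576
```

A starting offset that is a whole multiple of the stripe size (and of the cluster size) is the thing to aim for; powers of two such as 1 MiB or 4 MiB tend to satisfy most common stripe sizes at once.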
Once parted has done its work, it’s relatively easy to format the new partition with a new file system. I say “relatively” there, only because the formatting options for the ext4 file system are a pain in the neck! Here’s the command I issued:
sudo mkfs -t ext4 -m 0 -O extents,uninit_bg,dir_index,filetype,has_journal,sparse_super -L safedata /dev/mapper/isw_ifdbedffj_safedata
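Nice! The main points of interest here are that the mkfs command is being pointed at the mapped fake raid device (i.e., /dev/mapper/isw_ifdbedffj_safedata) rather than at /dev/sdb or one of its siblings; I’m giving the resulting file system a label (that’s the -L bit) of “safedata”, too; and I’m making sure the file system uses extents and a journal (extents make it fast, a journal makes it safe). What the other options are doing… well, that’s what documentation is for! One caveat: that path names the whole array, not the partition parted created inside it, so the file system goes straight onto the raw device. To format the partition instead, give mkfs the partition’s own entry, which should appear in the ls /dev/mapper/* listing once the new partition table has been re-read.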
Incidentally, when I first issued that command, I was told “/dev/mapper/isw_ifdbedffj_safedata is apparently in use by the system; will not make a filesystem here!” Quite how a disk volume with no file system could actually be in use by the system, I haven’t the faintest idea… but a reboot cured the problem and allowed me to format the thing without further complaint. (I realise this is very much the Windows User approach to Linux difficulties, but there are times when switching the thing off and on again actually works!)
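Once the format has gone through, it is possible to confirm that the features asked for really were switched on. The sketch below assumes the output format of tune2fs -l, which prints a "Filesystem features:" line listing them; has_fs_feature is a hypothetical helper name, and the sample line is illustrative rather than captured from a real device. Note that e2fsprogs usually reports the extents feature under the name "extent".

```shell
#!/bin/sh
# Read "tune2fs -l" style output on standard input and succeed when the
# "Filesystem features:" line lists the named feature as a whole word.
# (has_fs_feature is a hypothetical helper name.)
has_fs_feature() {
    awk -v want="$1" '
        /^Filesystem features:/ {
            # fields 1 and 2 are the label; features start at field 3
            for (i = 3; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Intended use on a real system (needs root):
#   tune2fs -l /dev/mapper/isw_ifdbedffj_safedata | has_fs_feature has_journal
# Illustrative sample line, NOT captured from a real device:
sample='Filesystem features:      has_journal dir_index filetype extent sparse_super uninit_bg'
echo "$sample" | has_fs_feature has_journal && echo "journal enabled"
```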
Finally, it’s time to mount the new file system -for which, of course, you need a mountpoint. I also like to ensure I assign ownership and permissions on the drive once it’s been mounted:
sudo mkdir /data
sudo mount /dev/mapper/isw_ifdbedffj_safedata /data
sudo chown -R hjr:users /data
sudo chmod -R 775 /data
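One side effect of chmod -R 775 is that every ordinary file on the volume ends up marked executable. A gentler variation (not what I ran above) uses chmod's capital X, which adds the execute bit only to directories and to files that are already executable. The sketch below tries it on a throwaway directory; apply_shared_perms is a hypothetical name.

```shell
#!/bin/sh
# Give owner and group read/write, others read-only, and add the execute
# (search) bit only to directories and to files that already have it.
# That is what the capital X means to chmod.
# (apply_shared_perms is a hypothetical helper name.)
apply_shared_perms() {
    chmod -R u=rwX,g=rwX,o=rX "$1"
}

# Try it on a throwaway directory rather than the real mountpoint.
demo=$(mktemp -d)
mkdir "$demo/photos"
: > "$demo/notes.txt"
apply_shared_perms "$demo"
ls -ld "$demo/photos" "$demo/notes.txt"   # directory gets x, plain file does not
rm -rf "$demo"
```

On the real volume the equivalent would be sudo chmod -R u=rwX,g=rwX,o=rX /data.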
And if that all works, you polish things off by editing /etc/fstab so that the new volume is re-mounted automatically every time the PC restarts. Fstab edits can get clever, sexy (sort-of) and convoluted… but I kept mine very short and to-the-point:
/dev/mapper/isw_ifdbedffj_safedata /data ext4 defaults 0 0
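Since a duplicated or mangled fstab line can make for an unhappy boot, it is worth checking whether the mountpoint is already listed before appending anything. This is a minimal sketch that works on whichever file it is given, so it can be practised on a scratch copy first; has_fstab_entry and add_fstab_entry are hypothetical names.

```shell
#!/bin/sh
# Succeed when an fstab-style file already has a non-comment entry whose
# second field (the mountpoint) matches the one given.
# (Both function names are hypothetical helpers.)
has_fstab_entry() {
    awk -v mp="$1" '
        $1 !~ /^#/ && $2 == mp { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$2"
}

# Append an ext4 entry only if the mountpoint is not already listed.
add_fstab_entry() {
    device=$1 mountpoint=$2 fstab=$3
    if has_fstab_entry "$mountpoint" "$fstab"; then
        echo "$mountpoint already present in $fstab"
    else
        echo "$device $mountpoint ext4 defaults 0 0" >> "$fstab"
    fi
}

# Practise on a scratch file; point it at /etc/fstab (as root) once happy.
scratch=$(mktemp)
add_fstab_entry /dev/mapper/isw_ifdbedffj_safedata /data "$scratch"
add_fstab_entry /dev/mapper/isw_ifdbedffj_safedata /data "$scratch"   # second call adds nothing
cat "$scratch"
rm -f "$scratch"
```

After editing the real file, sudo mount -a mounts everything listed in fstab, which will show up a typo without waiting for the next restart.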
Another reboot to check the thing actually does what it says on the tin, and we’re (finally!) sorted.