The problem: I have run out of storage space. Four 200GB parallel ATA hard drives configured as separate, single drives are a management nightmare, and for the past six months or so I have been shuffling data back and forth between them, trying to maintain some sort of order while using up ever smaller bits of free space. I eventually ended up with three of the drives at 0 bytes free, and the fourth so full it would not hold my latest batch of photos. Something needed to be done.
The goals of this project were twofold. First, to increase available storage space. Second, to introduce some sort of RAID because the prospect of losing huge swaths of data when one of those drives kicked the bucket was looking increasingly worrisome.
Since this is a personal file server, I wanted it to be fairly cheap. Performance really isn't a big concern for me. The files I'm hosting are primarily media (audio, video, and photos), with file sizes in the megabyte-and-up range. Files of this size stress neither filesystem performance nor I/O performance, so it should be easy to meet my goals. Software RAID 5 seemed to offer the promise of cheap, reasonably fast storage space able to withstand the death of any single drive. Perfect.
I've filled 800GB of space, so I need more than that. At the same time, I don't want to go overboard building a multi-terabyte server that will take years to fill. 900GB is a reasonable place to start, and with 300GB hard drives at the ideal price point, I purchased four, which would give me 1200GB of raw disk space, or 900GB after one disk's worth of parity information. Since 900GB is only about 12% more than the 800GB I have already filled, the ability to expand my array at a later date is a critical component. Luckily, this functionality is just now being added to the Linux software RAID driver, called md. The array I create today can be expanded tomorrow with a simple kernel upgrade. So far, so good.
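The capacity figures above are easy to check. Here is a minimal sketch of the arithmetic in Python; the function name `raid5_usable_gb` is made up for illustration, and it assumes all drives are the same size.

```python
def raid5_usable_gb(drive_count: int, drive_size_gb: int) -> int:
    """Usable space in a RAID 5 array: one drive's worth is spent on parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_size_gb


old_total_gb = 4 * 200                  # the four full PATA drives
raw_gb = 4 * 300                        # the four new SATA drives
usable_gb = raid5_usable_gb(4, 300)     # what is left after parity
headroom_pct = (usable_gb - old_total_gb) / old_total_gb * 100
after_fifth_drive_gb = raid5_usable_gb(5, 300)

print(raw_gb, usable_gb, headroom_pct, after_fifth_drive_gb)
```

The last figure shows why expansion matters: each additional 300GB drive adds a full 300GB of usable space, since the parity overhead stays at one drive.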
Routing cables for 4 IDE drives was painful enough, and Parallel ATA is a dying technology anyway, so I opted to use Serial ATA drives. This would require a new motherboard. Since I planned on starting with 4 drives and wanted room for expansion, I needed a board with more than 4 SATA ports. The MSI K8N Neo4 Platinum caught my eye. It's cheaper than I generally like for a motherboard, but so far it has proved to be very solid. The motherboard upgrade then necessitated a CPU and RAM upgrade, since the components in my old server are 5 years old and not compatible. The cheapest CPU that the motherboard would accept and a 1GB stick of Crucial RAM rounded out the basic hardware. The complete setup, software included:
- MSI K8N Neo4 Platinum motherboard
- AMD Athlon64 3000+ CPU
- Crucial 1GB DDR1-400 DIMM
- 4 x Seagate 7200.9 300GB SATA HDDs
- Fedora Core 5 x86-64 Linux distribution
- md driver / mdadm (no raidtools package)
Fedora Core has a fairly respectable RAID setup built right into the installer, but for various reasons I chose to set up the array after I had installed the OS. With the 4 hard drives plugged into the first four SATA ports (/dev/sda through /dev/sdd), array creation was a breeze.
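The exact command is not reproduced here. As a rough sketch only, a typical mdadm invocation for this layout can be assembled as follows. Two things are assumptions rather than details from this build: the array name /dev/md0, and the use of whole disks as members (partitions such as /dev/sda1 are just as common). The helper `mdadm_create_command` is hypothetical, and the script only prints the command rather than running it.

```python
import shlex


def mdadm_create_command(array, members, level=5):
    """Build (but do not run) an mdadm command line that creates an array."""
    return [
        "mdadm", "--create", array,
        f"--level={level}",
        f"--raid-devices={len(members)}",
        *members,
    ]


# Assumed member devices: the whole disks /dev/sda through /dev/sdd.
members = [f"/dev/sd{letter}" for letter in "abcd"]
command = mdadm_create_command("/dev/md0", members)  # /dev/md0 is an assumed name
command_line = shlex.join(command)
print(command_line)
```

Actually running a command like this requires root and destroys whatever is on the member devices. Once the array exists, /proc/mdstat shows the progress of the initial build.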
When creating a RAID 5 array, mdadm does not generate parity information across all disks in the array by reading data stripes and then writing the parity stripe, as you might expect. Instead, it creates a degraded array with one fewer drive than you specify, and then adds the last drive as a spare. This causes the array to be rebuilt, and the spare gets all the writes. This way is about twice as fast, but it makes the initial array creation look like it's building a RAID 4 array rather than a RAID 5. Be forewarned.
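To see why rebuilding onto a spare is enough to initialize the array, here is a toy model in Python. It is a simplification, not md's actual code: it ignores the rotating parity layout and simply treats each stripe as one block per drive, where the missing block is the XOR of the others. The sizes and names are arbitrary choices for the illustration.

```python
import os
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


STRIPES, BLOCK_SIZE = 8, 16

# Three active members holding whatever happened to be on the disks (random here).
active = [[os.urandom(BLOCK_SIZE) for _ in range(STRIPES)] for _ in range(3)]

# The rebuild only reads the active drives and only writes the spare:
# each block on the spare is the XOR of the other blocks in its stripe.
spare = [xor_blocks([drive[s] for drive in active]) for s in range(STRIPES)]
array = active + [spare]

# After the rebuild, every stripe XORs to zero, so the array is consistent.
consistent = all(
    xor_blocks([drive[s] for drive in array]) == bytes(BLOCK_SIZE)
    for s in range(STRIPES)
)

# Consequently any single lost drive can be recomputed from the survivors.
lost = 1
survivors = [drive for i, drive in enumerate(array) if i != lost]
recovered = [xor_blocks([drive[s] for drive in survivors]) for s in range(STRIPES)]

print(consistent, recovered == array[lost])
```

In this model the three active drives are never written, which matches the observation that the spare takes all the writes during the initial build.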
With these 4 drives (speed is primarily dependent on drive size rather than number of devices), the initial array creation took about 3 hours, though the array was usable during creation. I formatted the array, copied my old data over to it, and was up and running in no time. Not a moment too soon, as it happened. While I was preparing to erase and repurpose the drives from the old file server, a bearing seized in one of them, rendering it inoperable.