Thursday, September 8, 2011

The Story of RAID

If you’ve got only one hard drive in your server, feel free to skip ahead. Otherwise, let’s talk about putting those extra drives to use. The acronym RAID stands for redundant array of inexpensive disks, although if you’re a businessperson, you can substitute the word independent for inexpensive. We forgive you. And if you’re in France, RAID is short for recherche assistance intervention dissuasion, which is an elite commando unit of the National Police—but if that’s the RAID you need help with, you’re reading the wrong book. We think RAID is just a really awesome idea for data: When dealing with your information, it provides extra speed, fault tolerance, or both.

At its core, RAID is just a way to spread or replicate information across multiple physical drives. The process can be set up in a number of ways, and specific kinds of drive configurations are referred to as RAID levels. These days, even low- to mid-range servers ship with integrated hardware RAID controllers, which operate without any support from the OS. If your new server doesn’t come with a RAID controller, you can use the software RAID functionality in the Linux kernel that Ubuntu ships to accomplish the same goal.

Setting up software RAID while installing your Linux system was difficult and unwieldy only a short while ago, but it is a breeze these days: The Ubuntu installer provides a nice, convenient interface for it and then handles all the requisite backstage magic. You can choose from three RAID levels: 0, 1, and 5.

RAID 0
A so-called striped set, RAID 0 allows you to pool the storage space of a number of separate drives into one large, virtual drive. The important thing to keep in mind is that RAID 0 does not simply concatenate the physical drives; it spreads the data across them evenly, which means that no more space will be used on each physical drive than can fit on the smallest one. In practical terms, if you had two 250GB drives and a 200GB drive, the total amount of space on your virtual drive would equal 600GB; 50GB on each of the two larger drives would go unused. Spreading data in this fashion can substantially improve read and write throughput, because several drives work in parallel, but it also significantly decreases reliability. If any of the drives in your RAID 0 array fails, the entire array will come crashing down, taking your data with it.
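
To make the striping idea concrete, here is a minimal Python sketch of how a striped set might map data onto drives. It assumes the simplest possible round-robin layout, with one block per stripe unit; the function name locate_block is made up for illustration, and real implementations stripe in larger, configurable chunks.

```python
def locate_block(logical_block, num_drives):
    """Map a logical block number to (drive index, block offset on that drive).

    Assumes a plain round-robin layout: block 0 goes to drive 0, block 1 to
    drive 1, and so on, wrapping back to drive 0 after the last drive.
    """
    if num_drives < 1:
        raise ValueError("a striped set needs at least one drive")
    if logical_block < 0:
        raise ValueError("block numbers start at 0")
    return logical_block % num_drives, logical_block // num_drives


if __name__ == "__main__":
    # Show where the first six logical blocks land on a three-drive set.
    for block in range(6):
        drive, offset = locate_block(block, num_drives=3)
        print(f"logical block {block} -> drive {drive}, offset {offset}")
```

Because consecutive blocks sit on different drives, a large read or write keeps every drive busy at once. The same layout explains the fragility: every file of any size has pieces on every drive.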

RAID 1
This level provides very straightforward data replication. It takes the contents of one physical drive and mirrors them to as many other drives as you’d like. A RAID 1 array does not grow in size with the addition of extra drives; instead, it grows in reliability and read performance. The size of the entire array is limited by the size of its smallest constituent drive.

RAID 5
When the chief goal of your storage is fault tolerance, and you want more space than a RAID 1 array, which is capped at the size of a single drive, can provide, this is the level you want to use. RAID 5 lets you use n identically sized physical drives (if different-sized drives are present, no more space than the size of the smallest one will be used on each drive) to construct an array whose total available space is that of n–1 drives, and the array tolerates the failure of any one drive, but no more than one, without data loss.
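
The capacity rules for the three levels can be summed up in a few lines of Python. This is only a sketch of the arithmetic described above: the function name usable_capacity_gb is invented for illustration, and the minimum drive counts it enforces are the conventional ones, taken here as an assumption.

```python
def usable_capacity_gb(level, drive_sizes_gb):
    """Return the usable size of an array, given its RAID level and drive sizes.

    Every level uses no more of each drive than fits on the smallest one.
    """
    n = len(drive_sizes_gb)
    if n == 0:
        raise ValueError("no drives given")
    smallest = min(drive_sizes_gb)

    if level == 0:
        # Striping: every drive contributes the size of the smallest drive.
        if n < 2:
            raise ValueError("RAID 0 is assumed to need at least two drives")
        return n * smallest
    if level == 1:
        # Mirroring: extra drives add copies, not space.
        if n < 2:
            raise ValueError("RAID 1 is assumed to need at least two drives")
        return smallest
    if level == 5:
        # One drive's worth of space holds parity, leaving n - 1 for data.
        if n < 3:
            raise ValueError("RAID 5 is assumed to need at least three drives")
        return (n - 1) * smallest
    raise ValueError("only levels 0, 1, and 5 are covered by this sketch")


if __name__ == "__main__":
    print(usable_capacity_gb(0, [250, 250, 200]))  # the RAID 0 example above
    print(usable_capacity_gb(1, [250, 200]))
    print(usable_capacity_gb(5, [200, 200, 200]))
```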

Which RAID to Choose?
If you’re indecisive by nature, the past few paragraphs may have left you awkwardly hunched in your chair, mercilessly chewing a No. 2 pencil, feet tapping the floor nervously. Luckily, the initial choice of RAID level is often a no-brainer, so you’ll have to direct your indecision elsewhere. If you have one hard drive, no RAID for you. Do not pass Go, do not collect $200. Two drives? Toss them into RAID 1, and sleep better at night. Three or more? RAID 5. Unless you really know what you’re doing, avoid RAID 0 like the plague. RAID 0 makes sense only if you’re serving mostly read-only data and don’t care about redundancy.
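
If you prefer your rules of thumb in executable form, the advice above boils down to the following Python sketch. The function name suggest_raid_level is hypothetical, and the function encodes only the default recommendation, not the cases where you really do know what you’re doing.

```python
def suggest_raid_level(num_drives):
    """Return the suggested RAID level for a drive count, or None for no RAID."""
    if num_drives < 1:
        raise ValueError("a server needs at least one drive")
    if num_drives == 1:
        return None  # one drive: no RAID for you
    if num_drives == 2:
        return 1     # two drives: mirror them
    return 5         # three or more: striping with parity


if __name__ == "__main__":
    for count in (1, 2, 3, 6):
        print(count, "drive(s) ->", suggest_raid_level(count))
```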

The Mythical Parity Drive
If you toss five 200GB drives into a RAID 5 array, the array’s total usable size will be 800GB, or that of four drives. This makes it easy to mistakenly believe that a RAID 5 array “sacrifices” one of the drives for maintaining redundancy and parity, but this is not the case. The parity information is computed with a simple XOR across the data in each stripe (the fancier mathematics of polynomial coefficients over Galois fields is what double-parity levels such as RAID 6 need), and it is striped across all drives equally, allowing any single drive to fail without compromising the data. Don’t worry, though. We won’t quiz you on the math.
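
For the curious, here is a minimal Python sketch of single-drive recovery, assuming plain bytewise XOR parity, which is the usual way single parity is computed. The names xor_blocks and parity_drive are invented for illustration, and the rotation order used to spread parity across drives is an assumption; actual layouts vary between implementations.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of several equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_drive(stripe, num_drives):
    """Pick which drive holds parity for a stripe (hypothetical rotation order)."""
    return (num_drives - 1 - stripe) % num_drives


if __name__ == "__main__":
    # One stripe on a five-drive array: four data blocks plus one parity block.
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
    parity = xor_blocks(data)

    # Simulate losing the drive that held the third data block.
    survivors = data[:2] + data[3:] + [parity]
    rebuilt = xor_blocks(survivors)
    print("rebuilt block matches original:", rebuilt == data[2])

    # Parity moves to a different drive on each stripe, so no single drive
    # is "the parity drive".
    print([parity_drive(stripe, 5) for stripe in range(5)])
```

XOR-ing the surviving blocks with the parity block cancels out everything except the missing block, which is why exactly one failure per stripe can be repaired.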

Other RAID Modes
Though the installer offers only the most common RAID modes—0, 1, and 5—many other RAID modes exist and can be configured after the installation. Take a look at http://en.wikipedia.org/wiki/RAID for a detailed explanation of all the modes.

Source of Information: Prentice Hall, The Official Ubuntu Book, 5th Edition, 2010
