RAID
Mladen Stefanov F48235
Data is the most valuable asset of any business today. Lost data, in most
cases, means lost business. Even if you back up regularly, you need a fail-safe way to
ensure that your data is protected and can be accessed without interruption in the
event of an online disk failure. Adding RAID to your storage configuration is one of
the most cost-effective ways to maintain both data protection and access. But what
is RAID?
RAID combines two or more physical hard disks into a single logical unit. It
can be implemented in hardware, in the form of special disk controllers, or in
software, as a kernel module layered between the low-level disk driver and the file
system that sits above it.
Hardware RAID is always a disk controller: a device to which one can cable up
the disk drives. Usually it comes in the form of an adapter card that plugs into an
ISA/EISA/PCI/S-Bus/MicroChannel slot, although some RAID controllers come in the
form of a box that connects into the cable between the usual system disk controller
and the disk drives. The latest RAID hardware paired with the latest and fastest CPU
will usually provide the best overall performance, although at a significant price.
This is because most RAID controllers come with on-board DSPs and a memory
cache that can off-load a considerable amount of processing from the main CPU, as
well as allow high transfer rates into the large controller cache. Old RAID hardware,
however, can act as a "de-accelerator" when used with newer CPUs: yesterday's
fancy DSP and cache can become a bottleneck, and their performance is often
beaten by pure-software RAID on new but otherwise plain, run-of-the-mill disk
controllers. RAID hardware can offer an advantage over pure-software RAID if it
makes use of disk-spindle synchronization and its knowledge of the disk-platter
position relative to the disk head and the desired disk block. However, most
modern (low-cost) disk drives do not offer this information or level of control, and
thus most RAID hardware does not take advantage of it. RAID hardware is usually
not compatible across different brands, makes and models: if a RAID controller
fails, it must be replaced by another controller of the same type. As of this writing
(June 1998), a broad variety of hardware controllers will operate under Linux;
however, none of them currently come with configuration and management
utilities that run under Linux.
A hardware controller presents each array to the host as a single disk, which tends
to simplify management. With software RAID, there are far more configuration
options and choices, which tends to complicate matters.
The first parameter is the stripe width of the array. Stripe width refers to the
number of parallel stripes that can be written to or read from simultaneously,
which is equal to the number of disks in the array. That means that when we add
more drives to the array, we are increasing its parallelism. For example, an array of
four 160 GB drives will have greater transfer performance than an array of two
320 GB drives.
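As a rough sketch (not how any particular controller implements it), the benefit of a wider stripe can be seen by dividing one large transfer across the members of the array; each disk then moves only its share of the data, and the shares move in parallel:

```python
# Sketch: how stripe width splits one transfer across the disks of an array.
# With more disks, each disk handles a smaller share of the same transfer,
# so more of the work proceeds in parallel.

def per_disk_share(transfer_bytes, num_disks):
    """Bytes each disk must read or write for one full-stripe transfer."""
    return transfer_bytes // num_disks

transfer = 64 * 1024 * 1024  # one 64 MiB sequential transfer

four_drives = per_disk_share(transfer, 4)  # four 160 GB drives
two_drives = per_disk_share(transfer, 2)   # two 320 GB drives

# Same 640 GB total capacity either way, but in the 4-drive array each
# disk does only half as much work per transfer as in the 2-drive array.
assert four_drives * 2 == two_drives
```

This ignores seek and rotational overheads, which is why the real-world gain is less than linear, but it captures why adding drives increases the parallelism of the array.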
The second important parameter is the stripe size. The stripe size of a RAID
array is the smallest allocation unit of the logical disk or volume that is written to
each disk. It is sometimes called block size or chunk size, and typical values range
from 2 KiB to 512 KiB. The impact of stripe size on performance is more difficult to
quantify than the effect of stripe width, but there are two main things we should
know.
Decreasing stripe size: As the stripe size is decreased, files are broken into
smaller and smaller pieces. This increases the number of drives that an average file
will use to hold all the blocks containing its data, theoretically increasing transfer
performance, but decreasing positioning performance.
Increasing stripe size: Increasing the stripe size of the array does the opposite,
of course. Fewer drives are required to store a file of a given size, so transfer
performance decreases. However, if the controller is optimized to allow it, needing
fewer drives means the drives not involved in a particular access can serve another
one, improving positioning performance.
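The trade-off can be made concrete with a small calculation. The sketch below (a simplification that assumes a file is laid out contiguously starting at a stripe boundary) counts how many distinct drives an access to one file must touch:

```python
import math

def drives_touched(file_bytes, stripe_size, num_disks):
    """How many distinct drives hold pieces of a file, assuming the file
    is laid out contiguously from a stripe boundary (a simplification)."""
    chunks = math.ceil(file_bytes / stripe_size)
    return min(chunks, num_disks)

file_size = 256 * 1024  # a 256 KiB file
disks = 4

small_stripe = drives_touched(file_size, 16 * 1024, disks)    # 16 KiB stripe size
large_stripe = drives_touched(file_size, 256 * 1024, disks)   # 256 KiB stripe size

# Small stripe: the file spans all 4 drives, so it transfers in parallel,
# but all four heads must seek. Large stripe: the file fits on 1 drive,
# leaving the other three free to serve concurrent accesses.
```

With the numbers above, the 16 KiB stripe spreads the file over all four drives, while the 256 KiB stripe keeps it on a single drive, which is exactly the transfer-versus-positioning trade-off described in the two paragraphs above.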
Operation modes/states
Optimal is the normal operational mode for all RAID arrays. In this state
everything works fine: no drive has failed and performance is not affected.
Rebuilding – this term means the restoration process of the array, and it can
be initiated in two ways, manually or automatically. An automatic rebuild can begin
immediately after a disk failure occurs if there is a dedicated disk, called a hot
spare, and this option has been enabled in the array's configuration. Otherwise, the
administrator must replace the failed disk and start the rebuilding process manually.
A mirrored array must copy the contents of the good drive over to the replacement
drive. The rebuilding process is time-consuming and relatively slow - it can take
several hours. During this time the array will function properly, but its performance
will be greatly diminished. The impact of rebuilding on performance depends
entirely on the RAID level and the nature of the controller, but it is usually
significant. Hardware RAID will generally rebuild faster than software RAID.
Fortunately, rebuilding doesn't happen often.
The Degraded state (a drive has failed but the array is still operating) and the
Rebuilding state are both critical states, because the array has no fault tolerance
and the data is unprotected against a further failure.
RAID 0 (Striping)
Offers low cost and maximum performance, but no fault tolerance: a single
disk failure results in TOTAL data loss. Businesses use RAID 0 mainly for tasks
requiring fast access to a large capacity of temporary disk storage (such as
video/audio post-production, multimedia imaging, CAD, data logging, etc.) where,
in case of a disk failure, the data can easily be reloaded without impacting the
business. There is also no cost disadvantage, as all storage is usable. RAID 0 usable
capacity is 100%, as all available drives are used.
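A minimal sketch of how striping places data can help here. The mapping below is illustrative only (real controllers and software RAID drivers each have their own exact layout): logical blocks are grouped into stripe units and dealt round-robin across the member disks.

```python
def raid0_locate(logical_block, num_disks, blocks_per_stripe_unit):
    """Map a logical block number to (disk index, block offset on that disk)
    for a simple round-robin RAID 0 layout (illustrative, not any vendor's
    exact scheme)."""
    unit = logical_block // blocks_per_stripe_unit        # which stripe unit
    offset_in_unit = logical_block % blocks_per_stripe_unit
    disk = unit % num_disks                               # round-robin across disks
    row = unit // num_disks                               # stripe row on that disk
    return disk, row * blocks_per_stripe_unit + offset_in_unit

# 4 disks, 8 blocks per stripe unit:
print(raid0_locate(0, 4, 8))    # first unit lands on disk 0
print(raid0_locate(8, 4, 8))    # next unit lands on disk 1
print(raid0_locate(32, 4, 8))   # fifth unit wraps back to disk 0, next row
```

Because every block is stored exactly once, all of the raw capacity is usable - and, for the same reason, losing any one disk destroys part of every large file.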
RAID 1 (Mirroring)
Provides cost-effective, high fault tolerance for configurations with two disk
drives. RAID 1 refers to maintaining duplicate sets of all data on separate disk
drives. It also provides the highest data availability, since two complete copies of
all information are maintained. There must be two disks in the configuration, and
there is a cost disadvantage, as the usable capacity is half the available disk
capacity. RAID 1 offers data protection for any environment where absolute data
redundancy, availability and performance are key, and cost per usable gigabyte of
capacity is a secondary consideration.
RAID 1 usable capacity is 50% of the available drives in the RAID set.
RAID 2
Seldom used anymore, and to some degree made obsolete by modern disk
technology. RAID 2 is similar to RAID 4, but stores ECC information instead of
parity. Since all modern disk drives incorporate ECC under the covers, this offers
little additional protection. RAID 2 can offer greater data consistency if power is
lost during a write; however, battery backup and a clean shutdown can offer the
same benefits. RAID 2 is not supported by the Linux software RAID drivers.
RAID 5 (Striping with distributed parity)
Data is striped across the drives in the array, and parity information is stored
for each stripe; whenever a block is written, the parity for its stripe must be
recalculated and written along with the new data. To avoid a bottleneck, the parity
data for consecutive stripes is interleaved with the data across all disks in the array.
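The parity calculation itself is a byte-wise XOR, which is what makes single-drive recovery possible: the missing chunk of any stripe is the XOR of everything that survives. The sketch below demonstrates this, plus one possible rotation rule for interleaving the parity (the exact rotation varies between implementations):

```python
def xor_parity(chunks):
    """Parity chunk = byte-wise XOR of the data chunks in a stripe."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return bytes(parity)

# One stripe on a 4-disk RAID 5 (3 data chunks + 1 parity chunk):
d0, d1, d2 = b"\x0f\x0f", b"\xf0\xf0", b"\xaa\xaa"
p = xor_parity([d0, d1, d2])

# If the disk holding d1 fails, its contents are the XOR of the survivors:
recovered = xor_parity([d0, d2, p])
assert recovered == d1

def parity_disk(stripe_index, num_disks):
    """One possible rotation: parity moves one disk to the left each stripe,
    so no single disk becomes a parity hot spot (an illustrative rule)."""
    return (num_disks - 1 - stripe_index) % num_disks
```

The rotation is the "interleaving" the text describes: in RAID 4 a fixed parity disk absorbs a write for every stripe update and becomes the bottleneck, whereas here the parity load spreads evenly across all members.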
RAID 6 (Striping with dual parity)
Data is striped across several physical drives, and dual parity is used to store
and recover data. RAID 6 tolerates the failure of two drives in an array, providing
better fault tolerance than RAID 5. It also enables the use of more cost-effective
ATA and SATA disks to store business-critical data. This RAID level is similar to
RAID 5, but includes a second parity scheme that is distributed across different
drives, and therefore offers extremely high fault tolerance: RAID 6 can withstand a
double disk failure. RAID 6 requires a minimum of four disks and a maximum of 16
disks to be implemented. Usable capacity is always two drives less than the
number of available disk drives in the RAID set. With less expensive, but less
reliable, SATA disk drives, a RAID 6 configuration can achieve a higher level of
availability, because the second parity scheme allows the RAID set to withstand a
second failure during a rebuild. In a RAID 5 set, the degraded state and/or the
rebuilding time onto a hot spare is considered the window in which the RAID array
is most vulnerable to data loss: if a second disk failure occurs during this time, the
data is unrecoverable. With RAID 6 there is no such window of vulnerability, as the
second parity scheme protects against a second failure.
RAID 1E (Striped mirroring)
Combines data striping from RAID 0 with data mirroring from RAID 1. Data
written in a stripe on one disk is mirrored to a stripe on the next drive in the array.
The main advantage over RAID 1 is that RAID 1E arrays can be implemented with
an odd number of disks. With an even number of disks it is always preferable to
use RAID 10, which can survive multiple drive failures; with an odd number of
disks, RAID 1E supports only one drive failure. RAID 1E usable capacity is 50% of
the total available capacity of all disk drives in the RAID set.
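A simplified model of the placement rule makes it clear why an odd disk count works (real RAID 1E implementations interleave data and mirror rows, but the pairing logic is the same): each block goes to one disk and its copy goes to the next disk around the ring.

```python
def raid1e_layout(num_blocks, num_disks):
    """RAID 1E sketch: block i is stored on disk i % n and mirrored on the
    next disk, (i + 1) % n. Because the mirror wraps around the ring of
    disks, the scheme works for odd n, unlike plain RAID 1 pairs."""
    return {i: (i % num_disks, (i + 1) % num_disks) for i in range(num_blocks)}

layout = raid1e_layout(num_blocks=6, num_disks=3)  # odd number of disks
# Every block lives on two distinct disks, so any single disk can fail
# without data loss - and every block is stored twice, hence 50% usable.
assert all(a != b for a, b in layout.values())
```

Note that two adjacent disks in the ring can still share a block between them, which is why RAID 1E with an odd disk count tolerates only one failure.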
RAID 5EE
Provides the protection of RAID 5 with higher I/Os per second by utilizing one
more drive, with data efficiently distributed across the spare drive for improved
I/O access. RAID 5EE distributes the hot-spare drive space over the N+1 drives
comprising the RAID 5 array plus the standard hot-spare drive. This means that in
normal operating mode the hot spare is an active participant in the array rather
than spinning unused. In a normal RAID 5 array, adding a hot-spare drive protects
data by reducing the time spent in the critical rebuild state, but this technique
does not make maximum use of the hot-spare drive, because it sits idle until a
failure occurs; often many years can elapse before the hot spare is ever used. For
small RAID 5 arrays in particular, having an extra disk to read from (four disks
instead of three, for example) can provide significantly better read performance.
For example, going from a 4-drive RAID 5 array with a hot spare to a 5-drive
RAID 5EE array will increase read performance by roughly 25%. One downside of
RAID 5EE is that the hot-spare drive cannot be shared across multiple physical
arrays, as it can with standard RAID 5 plus hot spare. The RAID 5 technique is more
cost-efficient for multiple arrays, because it allows a single hot-spare drive to
provide coverage for several physical arrays; this reduces the cost of using a
hot-spare drive, but the downside is the inability to handle simultaneous drive
failures in different arrays. RAID 5EE can sustain a single drive failure. RAID 5EE
usable capacity is between 50% and 88%, depending on the number of data drives
in the RAID set, and RAID 5EE requires a minimum of four disks and a maximum of
16 disks to be implemented.
Nested (hybrid) RAIDs
RAID 10 (Striping and mirroring)
Combines RAID 0 striping and RAID 1 mirroring. This level provides the
improved performance of striping while still providing the redundancy of
mirroring. RAID 10 is the result of forming a RAID 0 array from two or more RAID 1
arrays. This RAID level provides fault tolerance: up to one disk in each mirrored
sub-array may fail without causing loss of data. Usable capacity of RAID 10 is 50%
of the available disk drives.
RAID 50 (Striping)
Combines multiple RAID 5 sets with RAID 0 (striping), which provides better
performance than a single large RAID 5 array. Usable capacity of RAID 50 is
between 67% and 94%, depending on the number of data drives in the RAID set.
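The 67%-94% range follows directly from the layout, since each RAID 5 sub-array gives up exactly one disk to parity. A short check (with layouts chosen here as examples; the quoted maximum assumes 16-disk sub-arrays):

```python
def raid50_usable_fraction(sets, disks_per_set):
    """Usable fraction of raw capacity for RAID 50: each RAID 5 sub-array
    sacrifices one disk's worth of space to parity, and RAID 0 stripes
    across the sub-arrays without further overhead."""
    total = sets * disks_per_set
    usable = sets * (disks_per_set - 1)
    return usable / total

# Smallest layout: 2 sub-arrays of 3 disks -> 4 of 6 disks usable (~67%).
# A wide layout: 2 sub-arrays of 16 disks -> 30 of 32 disks usable (~94%).
print(round(raid50_usable_fraction(2, 3) * 100))   # lower end of the range
print(round(raid50_usable_fraction(2, 16) * 100))  # upper end of the range
```

The fraction depends only on the width of each sub-array, not on how many sub-arrays are striped together, which is why adding more RAID 5 sets scales capacity without changing the overhead percentage.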
RAID 60 (Striping with dual parity)
Combines multiple RAID 6 sets with RAID 0 (striping). Dual parity allows the
failure of two disks in each RAID 6 array, while striping helps to increase capacity
and performance without adding disks to each RAID 6 array (which would decrease
data availability and could impact performance in degraded mode).
Obviously there is no single optimal RAID level, stripe size or RAID
implementation, whether software or hardware. Everything depends on the
current circumstances and the available resources.
Last month a new Supermicro server replaced the three old Fujitsu servers
and the old tape device.
The new Supermicro server has a 24-bay enclosure: 18x 2 TB Western Digital
drives plus one 2 TB global hot-spare disk in RAID 6 for storage, and 2x 136 GB SAS
drives in RAID 1 for the operating system and applications.