Saturday 14 February 2015

Basics of R.A.I.D - Pros and Cons


RAID Technology

RAID originally stood for "redundant array of inexpensive disks" and is now commonly expanded as "redundant array of independent disks". RAID technology combines multiple disk drive components into a single logical unit for the purposes of data redundancy or performance improvement.

Data is distributed across the drives in one of several ways, referred to as RAID levels, depending on the level of redundancy and performance required.
 
RAID technology is used to protect databases from disk failure. It is the most common protection method in high-availability and clustered environments, and it is the basis for a local data protection strategy. Storage performance is affected by the type of RAID used, stripe size, read/write ratio, array set placement, multipathing architecture, and SAN fabric topology.

 

The performance of a drive array depends on two factors:

  • Number of disks per array — As long as the controller and SCSI bus are not limiting factors, the performance of a RAID array grows linearly with the number of disks. When transmitting sequential data, the SCSI bus easily reaches its maximum bandwidth and prevents further performance gains when more disks are added. This usually does not happen in a random I/O environment.
  • RAID level — Performance is affected by both the number of disks in the array and the RAID level to which the array is set.

Types of RAID levels

The following are brief descriptions of the popular RAID levels:
  • RAID 0      (Striping)
  • RAID 1      (Mirroring and duplexing)
  • RAID 1+0  (Striping of RAID 1)
  • RAID 5      (distributed data guarding/Striping with Parity)
  • RAID 6      (advanced data guarding/Striping with Dual Parity)
  • RAID 5+0 & RAID 6+0
*************************

RAID 0

RAID 0 distributes all data that goes to the logical drive evenly across all physical disks in the array. The way data is spread across the drives is determined by the stripe size.
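As a rough illustration of how the stripe size decides where data lands, the sketch below maps a logical byte offset to a disk and an offset on that disk. It assumes a simple round-robin layout in which the stripe size is the amount written to one disk before moving on to the next. The function name `locate_block` and the 4-disk, 64 KiB figures are made up for this example, and real controllers may lay data out differently.

```python
from __future__ import annotations


def locate_block(logical_offset: int, stripe_size: int, num_disks: int) -> tuple[int, int]:
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes plain round-robin striping; this is an illustrative layout,
    not the behaviour of any particular controller.
    """
    stripe_number = logical_offset // stripe_size   # which stripe unit, counted across the array
    disk_index = stripe_number % num_disks          # stripe units rotate over the disks
    row = stripe_number // num_disks                # full rows of stripe units before this one
    disk_offset = row * stripe_size + logical_offset % stripe_size
    return disk_index, disk_offset


if __name__ == "__main__":
    stripe = 64 * 1024  # hypothetical 64 KiB stripe size
    for offset in (0, stripe, 4 * stripe, 5 * stripe + 100):
        print(offset, locate_block(offset, stripe, num_disks=4))
```

Consecutive stripe units fall on different disks, which is why large sequential transfers can keep every drive busy at once.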



RAID 0 features:

  • Extremely high read performance
  • Extremely high write performance
  • No loss of capacity
  • Loss of all data when a single drive fails




*************************

RAID 1 (Mirroring and Duplexing)

Comparing mirroring and duplexing
 

Mirroring is the simplest method of avoiding data loss when a single disk fails. During writes, data is sent to both disks. During reads, different data can be read from each disk at the same time, which provides good read performance.

Duplexing is another way of keeping a copy of data. Where mirroring uses a single controller for both disks, duplexing uses a separate controller for each disk, so it can also survive a controller failure. Duplexing requires an operating system that supports RAID 1.




RAID 1 features:

  • Good read performance
  • Standard write performance
  • Loss of 50% capacity
  • No data loss when a single drive fails

************************* 

RAID 1+0  (Striping of RAID 1)

RAID 1+0 is a combination of RAID 1 and RAID 0. First, each data drive is mirrored to a partner disk. Then the mirrored pairs are combined into a RAID 0 array. RAID 1+0 requires an even number of disks (minimum is 4 and maximum is 56). RAID 1+0 combines good performance with good data protection. In a RAID 1+0 array consisting of 10 drives, any single disk can fail without data loss. If a second disk fails, the risk of losing data is only 11%, because only one of the nine remaining disks is the mirror partner of the failed one. Even when a third disk fails, the chance of data loss is only 25% (two unprotected partners among eight remaining disks). Performance after a disk failure will not decrease as it does in a RAID 5 array. Note: RAID 1+0 is sometimes displayed as RAID 10.
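The 11% and 25% figures for the 10-drive array can be checked with a few lines of arithmetic. The sketch below assumes that every surviving disk is equally likely to fail next and that all earlier failures hit different mirror pairs. The function name is hypothetical and exists only for this example.

```python
from fractions import Fraction


def loss_chance_on_next_failure(total_disks: int, failed_disks: int) -> Fraction:
    """Chance that the next disk failure destroys data in a RAID 1+0 array.

    Assumption (for illustration only): earlier failures all hit different
    mirror pairs, and every surviving disk is equally likely to fail next.
    """
    if total_disks < 4 or total_disks % 2:
        raise ValueError("RAID 1+0 needs an even number of disks, at least 4")
    if not 0 <= failed_disks <= total_disks // 2:
        raise ValueError("at most one disk per mirror pair may have failed")
    surviving = total_disks - failed_disks
    # Each failed disk leaves exactly one unprotected partner among the survivors.
    return Fraction(failed_disks, surviving)


if __name__ == "__main__":
    for already_failed in (1, 2):
        chance = loss_chance_on_next_failure(10, already_failed)
        print(f"{already_failed} disk(s) already failed: {chance} ({float(chance):.0%})")
```

With 10 disks this gives 1/9 after the first failure and 2/8 after the second, which match the percentages quoted above.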


 

RAID 1+0 features:

  • Very good read performance (close to RAID 0)
  • Good write performance (half of read performance)
  • Loss of 50% capacity
  • No data loss when a single drive fails, and the chance to survive multiple disk failures increases with the number of disks
Note: RAID 1+0 is the best of all RAID levels whenever a combination of performance and data protection is required. But as is typical, the best options are the most expensive.


*************************

RAID 5 (Distributed data guarding/Striping with Parity)

RAID 5 is the most popular of all RAID levels. It provides data protection while requiring only a single extra disk. Although read performance is very good, write performance is quite low because each write to the logical drive requires four I/Os to the physical drives to update the checksums: read the old data, read the old parity, write the new data, and write the new parity. The overall performance of a RAID 5 array with three disks is often no better than that of a single disk. So, when using RAID 5, use as many disks as possible.
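To make the four I/Os concrete, here is a minimal sketch that assumes the checksum is a byte-wise XOR parity, which is the usual scheme for RAID 5. The helper names `xor_blocks` and `small_write_parity` are invented for this example, and the blocks are tiny byte strings rather than real disk sectors.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Parity after updating one data block.

    The update costs four physical I/Os: read old data, read old parity,
    write new data, write new parity. Only the parity math is shown here.
    """
    return xor_blocks(old_parity, old_data, new_data)


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]   # three data blocks in one stripe
    parity = xor_blocks(*data)

    new_block = b"\xaa\xbb"
    new_parity = small_write_parity(data[1], parity, new_block)
    # Same result as recomputing parity over the whole stripe.
    print(new_parity == xor_blocks(data[0], new_block, data[2]))

    # A lost block can be rebuilt from the surviving blocks plus parity.
    print(xor_blocks(data[0], data[2], parity) == data[1])
```

The same XOR relation is what lets the array rebuild the contents of a single failed drive from the remaining drives.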

RAID 5 features:

  • Very good read performance
  • Low write performance (one quarter of read performance)
  • No data loss when a single drive fails
  • Capacity loss of one disk (used for storing parity information), independent of the number of disks in the array
 *************************


 RAID 6 (Advanced data guarding/Striping with Dual Parity)

RAID 6 is also known as advanced data guarding (ADG). The working principle is similar to RAID 5, but ADG uses two checksums for data protection. Of the basic (non-nested) levels described here, it is the only one that can guarantee that an array is still working after the simultaneous failure of two disks. Maintaining two checksums requires six I/Os per logical write access. RAID 6 should not be used in production environments where write performance is important. However, it is the best method to protect archived data.
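The I/O counts per logical write quoted in this post (two for mirrored levels, four for RAID 5, six for RAID 6) can be turned into a back-of-the-envelope throughput estimate. The sketch below uses the common rule of thumb that each write costs "penalty" physical I/Os while each read costs one. The per-disk IOPS value and the function name are assumptions for illustration, and the estimate ignores controller cache and bus limits.

```python
# Physical I/Os per logical write, taken from the descriptions in this post.
WRITE_PENALTY = {"RAID 0": 1, "RAID 1+0": 2, "RAID 5": 4, "RAID 6": 6}


def effective_iops(level: str, disks: int, iops_per_disk: float, write_fraction: float) -> float:
    """Rough host-visible IOPS for a given read/write mix.

    Assumes random I/O, identical disks and no caching (illustrative only).
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw = disks * iops_per_disk
    cost_per_request = (1.0 - write_fraction) + write_fraction * WRITE_PENALTY[level]
    return raw / cost_per_request


if __name__ == "__main__":
    # Hypothetical array: 8 disks at 150 IOPS each, 30% writes.
    for level in WRITE_PENALTY:
        print(level, round(effective_iops(level, 8, 150, 0.30)))
```

The gap between the levels widens as the share of writes grows, which is why the read/write ratio matters when choosing a level.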





RAID 6 features:

  • Very good read performance 
  • Write performance lower than that of RAID 5 
  • No data loss when two drives fail simultaneously
  • Capacity loss of two disks, independent of the number of disks in the array


 *************************

RAID 50 and 60

RAID 50 (RAID 5+0) and RAID 60 (RAID 6+0) are new RAID levels introduced with the new generation of array controllers. RAID 50 and 60 methods stripe the data across multiple RAID/JBOD sets with different levels of parity.


 RAID 50 (RAID 5+0)



RAID 50 (RAID 5+0) is a nested RAID method that uses RAID 0 block-level striping across RAID 5 arrays with distributed parity. RAID 50 will tolerate one drive failure in each spanned array without loss of data. RAID 50 configurations require a minimum of six drives and require less rebuild time than single RAID 5 arrays.







RAID 60 (RAID 6+0)








RAID 60 (RAID 6+0) is a nested RAID method that uses RAID 0 block-level striping across multiple RAID 6 arrays with dual distributed parity. With the inclusion of dual parity, RAID 60 will tolerate the failure of two disks in each spanned array without loss of data. RAID 60 configurations require a minimum of eight drives.
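The failure rules for the nested levels come down to a per-sub-array check: the whole array survives only if no spanned array has lost more disks than its parity can cover. A minimal sketch of that rule follows; the function and dictionary names are made up for this example.

```python
from typing import Sequence

# Disk failures each spanned (sub-)array can absorb, per the descriptions above.
TOLERATED_PER_SUB_ARRAY = {"RAID 50": 1, "RAID 60": 2}


def array_survives(level: str, failures_per_sub_array: Sequence[int]) -> bool:
    """True if no sub-array has lost more disks than its parity covers."""
    tolerated = TOLERATED_PER_SUB_ARRAY[level]
    return all(failed <= tolerated for failed in failures_per_sub_array)


if __name__ == "__main__":
    print(array_survives("RAID 50", [1, 1]))  # one failure in each sub-array
    print(array_survives("RAID 50", [2, 0]))  # two failures in the same sub-array
    print(array_survives("RAID 60", [2, 2]))  # two failures in each sub-array
```

Note that the total number of failed disks matters less than how they are spread across the sub-arrays.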
 ************************

RAID Group Comparison

The following table compares the RAID levels by protection, performance, and capacity impact. Use it to choose a level based on your own requirements.


| Features | RAID 0 | RAID 1 | RAID 10 | RAID 5 | RAID 50 | RAID 6 | RAID 60 |
|---|---|---|---|---|---|---|---|
| Minimum Hard Drives | 2 | 2 | 4 | 3 | 6 | 4 | 8 |
| Data Protection | No Protection | Single-drive failure | Up to one disk failure in each sub-array | Single-drive failure | Up to one disk failure in each sub-array | Two-drive failure | Up to two disk failures in each sub-array |
| Read Performance | High | High | High | High | High | High | High |
| Write Performance | High | Medium | Medium | Low | Medium | Low | Medium |
| Read Performance (degraded) | N/A | Medium | High | Low | Medium | Low | Medium |
| Write Performance (degraded) | N/A | High | High | Low | Medium | Low | Low |
| Capacity Utilization | 100% | 50% | 50% | 67% - 94% | 67% - 94% | 50% - 88% | 50% - 88% |
| Typical Applications | High End Workstations, data logging, real-time rendering, very transitory data | Operating System, transaction databases | Fast databases, application servers | Data warehousing, web serving, archiving | Large databases, file servers, application servers | Data archive, backup to disk, high availability solutions, servers with large capacity requirements | Data archive, backup to disk, high availability solutions, servers with large capacity requirements |
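The capacity utilization figures follow from how many disks' worth of space each level gives up to mirroring or parity. The sketch below computes the usable fraction under the assumption that all disks are the same size. The function name is hypothetical, and the 16-drive upper bound used in the demo is only an inference from the 94% and 88% figures, not something the comparison states.

```python
def usable_fraction(level: str, disks: int, sub_arrays: int = 1) -> float:
    """Fraction of raw capacity available for data (equal-sized disks assumed)."""
    if level == "RAID 0":
        overhead = 0
    elif level in ("RAID 1", "RAID 10"):
        overhead = disks // 2            # half of the disks hold mirror copies
    elif level == "RAID 5":
        overhead = 1                     # one disk's worth of parity
    elif level == "RAID 6":
        overhead = 2                     # two disks' worth of parity
    elif level == "RAID 50":
        overhead = sub_arrays            # one parity disk per RAID 5 sub-array
    elif level == "RAID 60":
        overhead = 2 * sub_arrays        # two parity disks per RAID 6 sub-array
    else:
        raise ValueError(f"unknown level: {level}")
    return (disks - overhead) / disks


if __name__ == "__main__":
    for level, disks in (("RAID 5", 3), ("RAID 5", 16), ("RAID 6", 4), ("RAID 6", 16)):
        print(f"{level} with {disks} disks: {usable_fraction(level, disks):.1%}")
```

For RAID 5 this runs from 2/3 with three disks to 15/16 with sixteen, and for RAID 6 from 2/4 to 14/16, in line with the ranges listed for those levels.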
 

  ************************

Deciding between S/W RAID and H/W RAID


Types of RAID

Software-Based
  • Description: Best used for large block applications such as data warehousing or video streaming. Also where servers have the available CPU cycles to manage the I/O intensive operations certain RAID levels require. Included in the OS, such as Windows and Linux. All RAID functions are handled by the host CPU which can severely tax its ability to perform other computations.
  • Advantages: Low price. Only requires a standard adapter.

Hardware-Based
  • Description: Best used for small block applications such as transaction oriented databases and web servers. Processor-intensive RAID operations are off-loaded from the host CPU to enhance performance. Battery-backed write back cache can dramatically increase performance without adding risk of data loss.
  • Advantages: Data protection and performance benefits of RAID. More robust fault-tolerant features and increased performance versus software-based RAID.

External Hardware
  • Description: Connects to the server via a standard adapter. RAID functions are performed on a microprocessor located on the external RAID adapter independent of the host.
  • Advantages: OS independent. Build high-capacity storage systems for high-end servers.


  ************************

Importance of Cache Memory in Hardware RAID Controller

Note: RAID 5, 50, 6, and 60 are the most common secure RAID levels. A RAID 5 array can withstand a single disk failure without losing data or access to data. Although RAID 5 can be implemented in software, a hardware controller is recommended. Data is transferred to the disks by independent read and write operations (not in parallel), and the data chunks that are written are also larger, so extra cache memory is required on the RAID controller to improve write performance. During a power failure or other crash, data that has not yet been written to disk can be lost if the RAID controller's cache memory is not battery backed.

Please feel free to write your suggestion/feedback and ask for any further assistance on this.

Thanking you
Miten Suvagiya