
RAID Arrays Explained

Wait, can someone explain stripe size to me? 64K or 128K, which is better?
 

The simplest way to explain this is:

When you create a RAID array, the disks are segmented into blocks. This is the minimum amount of space a file will take up.

So if you have a block size of 64k and write a 2k file, the file will use up 64k of space; with a block size of 128k, the file will take up 128k.


So why bother:

If you have a file which is 64000k, it will take up 1000 blocks with a 64k block size, and 500 with 128k blocks.
Now, since each block has its own location on the disk, with a 128k block size the disk will seek 50% less.

So in short:
Small block/stripe size = best use of space but more seeks on the disk (very good if you are storing lots and lots of very small files or are using a database application that makes lots and lots of small reads)
Large block/stripe size = more wasted space but saves on disk seeks (very good if you are storing large files and want to minimise the effects of disk access times (seeks))
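
A quick sketch of the block-count arithmetic in this post, in Python. It only counts how many stripe-sized blocks a contiguous file spans, assuming the file starts on a block boundary; it says nothing about how much space the file occupies. `stripe_units_touched` is a made-up helper name for illustration.

```python
def stripe_units_touched(file_kib: int, stripe_kib: int) -> int:
    """Count the stripe-sized blocks spanned by a contiguous file.

    Assumes the file starts exactly on a block boundary (an assumption
    made for this sketch, not something a real array guarantees).
    """
    # Integer ceiling division: a partly used final block still counts.
    return (file_kib + stripe_kib - 1) // stripe_kib


for stripe_kib in (64, 128):
    units = stripe_units_touched(64000, stripe_kib)
    print(f"{stripe_kib}k stripe: {units} blocks")
# prints:
# 64k stripe: 1000 blocks
# 128k stripe: 500 blocks
```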
 
Bump for a sticky? :laugh:
 
That would have to be the best explanation on the entire internet.

Well done IggSter :)
 
You need not worry about wasted space when choosing a stripe size. Stripe ≠ cluster.

Yes, the controller works with data at these sizes, but the OS does not. It's not even aware that it's on an array of any sort.
 
This needs to be a sticky and moved to storage
 
I've been saying sticky since I first wrote this.
 
RAID STRIPE SIZE AND SLACK SPACE

There is a misconception with regard to space usage and stripe size. There seem to be a few people creating posts with the concern that large stripe sizes are potentially "wasteful" if there are many small files to be written to a RAID array. Such statements are woefully inaccurate, to put it mildly.

In the majority of cases stripe size has absolutely no bearing on space usage (unlike cluster size). If a "stripe" isn't fully populated (filled) by a single file, then the next file, or files, will be written into that stripe's space until the stripe is completely "filled". In other words, a single stripe can hold multiple files, and a stripe is completely filled before disk writes continue on to the next stripe, which is on the next disk. The misconception that space is "wasted" by configuring a RAID array with large stripes applies to cluster (allocation unit) size; it is completely inaccurate with regard to stripe size. If a stripe isn't filled on the first pass, it will be used by the next write(s) until it is filled.

For example: The assertion is that if you choose a 64kb stripe size and you store a 2kb text file then that file will be written to a stripe and no other files will be able to be stored in that stripe thereby wasting 62kb of disk space. This concept is as wrong as wrong can be.

The only instance of disk space being used in this manner is with regard to FILESYSTEM BLOCKS. As a file is stored within a disk's filesystem, it is written into an available block; as each block fills, writing continues to the next available block until the file is completely written to the disk. In many instances the last block written will be only partially filled, which results in "slack space". Likewise, a file smaller than a block leaves slack space in that block. Because of the way files are stored on disk, this slack space cannot be used by future writes and is considered to be "wasted" space.
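
To put numbers on the slack-space idea, here is a rough Python sketch. It assumes a simplified model in which every file occupies a whole number of filesystem blocks (clusters) and nothing else is stored alongside it; `slack_bytes` is a hypothetical helper name, and the 4 KiB and 64 KiB cluster sizes are just example values.

```python
KIB = 1024


def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes left unused in the last cluster a file occupies.

    Simplified model: the file takes whole clusters only, so whatever
    part of the final cluster it does not fill is slack.
    """
    remainder = file_size % cluster_size
    # A file that ends exactly on a cluster boundary leaves no slack.
    return 0 if remainder == 0 else cluster_size - remainder


# The 2 KB text file from the example, on two cluster sizes.
for cluster in (4 * KIB, 64 * KIB):
    wasted = slack_bytes(2 * KIB, cluster)
    print(f"{cluster // KIB} KiB clusters: {wasted // KIB} KiB slack")
# prints:
# 4 KiB clusters: 2 KiB slack
# 64 KiB clusters: 62 KiB slack
```

The 62 KiB figure is the waste people attribute to a 64k stripe, but in this model it only appears when the filesystem cluster size is 64 KiB.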

This is not the case with regard to "stripes". To the OS a RAID array appears as any other storage device and multiple filesystem blocks may reside in a single stripe element.
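
A minimal Python sketch of why several filesystem blocks can share one stripe, assuming plain RAID-0 with round-robin placement and no metadata offset. `locate` is a hypothetical helper, and the 64 KiB stripe, 4 KiB cluster and two-disk array are example values, not taken from any particular controller.

```python
from typing import Tuple

KIB = 1024


def locate(offset: int, stripe_size: int, n_disks: int) -> Tuple[int, int]:
    """Map a logical byte offset on the array to (disk index, byte offset on that disk).

    Assumes simple round-robin RAID-0 striping: stripe 0 on disk 0,
    stripe 1 on disk 1, and so on, wrapping back to disk 0.
    """
    stripe_index = offset // stripe_size      # which stripe, counted across the array
    disk = stripe_index % n_disks             # round-robin choice of disk
    row = stripe_index // n_disks             # how many stripes precede it on that disk
    return disk, row * stripe_size + offset % stripe_size


stripe, cluster, disks = 64 * KIB, 4 * KIB, 2
# The first 16 clusters (16 x 4 KiB = 64 KiB) all land in the same stripe on disk 0;
# cluster 16 is the first one to land on disk 1.
for n in (0, 1, 15, 16):
    print(n, locate(n * cluster, stripe, disks))
```

Each cluster could belong to a different small file, which is why a large stripe by itself does not leave unused space behind.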

As an aside, you will notice better performance if you manage to match the filesystem block size to the RAID stripe size, and only in that arrangement is it possible for a stripe to have wasted slack space.

Summary: If a stripe isn't fully populated (filled) by a single file, then the next file, or files, will be placed into that stripe until it is completely "filled". In other words, a single stripe can hold multiple files, and a stripe is completely filled before disk writes continue on to the next stripe, which is on the next disk. If a stripe isn't filled on the first pass, it will be used by a subsequent write until that area of disk is filled.

RAID STRIPE SIZE AND SLACK SPACE by Solarsails
 
RAID (Redundant Array of Inexpensive [or independent] Disks)

RAID-0 - Data striping without parity, 2-disk minimum:
Provides improved performance over a single, non-RAID drive and provides additional storage space to work with. This array breaks the data down into blocks, which are distributed across the member drives.

Array size: Size of Smallest Drive x Number of Drives
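
The array-size rule can be written as a one-line calculation. This is just an illustrative Python sketch; `raid0_capacity` is a made-up name and the drive sizes are example values in GB.

```python
from typing import Sequence


def raid0_capacity(drive_sizes_gb: Sequence[float]) -> float:
    """Usable RAID-0 size: size of the smallest drive x number of drives."""
    if len(drive_sizes_gb) < 2:
        raise ValueError("RAID-0 needs at least two drives")
    # Every member contributes only as much as the smallest drive can hold.
    return min(drive_sizes_gb) * len(drive_sizes_gb)


print(raid0_capacity([500, 500]))   # 1000
print(raid0_capacity([250, 500]))   # 500 (half of the larger drive goes unused)
```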

Advantages: This array is the easiest and cheapest to implement, and almost all controllers support RAID-0. It can make boot times quicker and make applications load faster.

Disadvantages: Not fault tolerant. In other words if one drive fails, all data is lost.

Recommendation: Do not use it in an environment where data is of the utmost importance, such as a law firm or school corporation. If you implement this array, it is HIGHLY recommended that you schedule daily or weekly backups (preferably every couple of days, or whenever you add new data), so that if one (or all) of the drives fail you can recover from the failure. I would not use more than four drives either, because each added drive increases the risk of losing data. RAID-0 is better suited to an environment where applications require high performance, such as gaming or working with digital imaging.
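
To see why more drives means more risk, here is a rough Python sketch. It assumes every drive fails independently with the same probability over some period; the 3% figure is a purely hypothetical number chosen for illustration, and `raid0_loss_probability` is a made-up helper name.

```python
def raid0_loss_probability(p_drive: float, n_drives: int) -> float:
    """Chance that a RAID-0 array loses its data in a given period.

    Assumes independent drive failures with equal probability p_drive.
    RAID-0 has no redundancy, so the array survives only if every
    drive survives.
    """
    return 1 - (1 - p_drive) ** n_drives


P_DRIVE = 0.03  # hypothetical per-drive failure probability, not a measured value
for n in (1, 2, 4, 8):
    print(f"{n} drive(s): {raid0_loss_probability(P_DRIVE, n):.1%} chance of loss")
```

Under these assumptions the risk grows with every drive added, which is the reasoning behind keeping RAID-0 arrays small and backed up.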

Nice to add that RAID-0 is not really a RAID as there is no redundancy.
 