Server Fault, asked by Nicolas De Jay on November 4, 2021
We are setting up an ADAPT0 (RAID-60-like) configuration for a file server.
We have six disk pools. Each consists of 14 disks and is set up using ADAPT. According to Dell’s official white paper, ADAPT is similar to RAID 6 but distributes spare capacity. On page 13, it is indicated that the chunk size is 512 KiB and that the stripe width is 4 MiB (over 8 disks) for each disk pool.
My understanding is that in each 14-disk pool, 2 disks' worth of capacity is reserved as spare, 20% of the remaining 12 disks (2.4 disks' worth of capacity) is used for parity, and 80% (9.6 disks) is used for storage (i.e. 2 parity chunks per 8+2 stripe, so 12 × 0.2 = 2.4 and 12 × 0.8 = 9.6). However, the chunk size is still 512 KiB and the stripe width remains 4 MiB, since we are only writing to 8 disks in one contiguous block.
To achieve an ADAPT0 (RAID-60-like) configuration, we then used LVM to create a logical volume that stripes over two disk pools. Our intent is to eventually have three such volumes, each striped over two disk pools. We used a stripe size that matches that of the hardware RAID (512 KiB):
$ vgcreate vg-gw /dev/sda /dev/sdb
$ lvcreate -y --type striped -L 10T -i 2 -I 512k -n vol vg-gw
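As a sanity check (assuming a reasonably recent LVM2; field names may vary slightly between versions), the resulting stripe layout can be read back with:
$ lvs -o +stripes,stripe_size vg-gw/vol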
Next, we set up an XFS file system over the striped logical volume. Following guidelines from XFS.org and a few other sources, we matched the stripe unit su to the LVM and RAID stripe size (512k) and set the stripe width sw to 16, since we have 16 "data disks".
$ mkfs.xfs -f -d su=512k,sw=16 -l su=256k /dev/mapper/vg--gw-vol
$ mkdir -p /vol/vol
$ mount -o rw -t xfs /dev/mapper/vg--gw-vol /vol/vol
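The alignment the filesystem actually recorded can be read back with xfs_info (sunit and swidth are reported in filesystem blocks rather than bytes):
$ xfs_info /vol/vol | grep -E 'sunit|swidth'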
We benchmarked sequential I/O performance with a 4 KiB block size on /dev/sda, /dev/sdb, and /dev/mapper/vg--gw-vol using
fio --name=test --ioengine=posixaio --rw=rw --bs=4k --numjobs=1 --size=256g --iodepth=1 --runtime=300 --time_based --end_fsync=1
We were surprised to obtain similar performance:
Volumes                  Throughput   Latency
-----------------------  -----------  ----------
/dev/sda                 198 MiB/s    9.50 usec
/dev/sdb                 188 MiB/s    10.11 usec
/dev/mapper/vg--gw-vol   209 MiB/s    9.06 usec
If we use the I/O monitoring tool bwm-ng, we can see I/O to both /dev/sda and /dev/sdb when writing to /dev/mapper/vg--gw-vol.
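(As an aside, not part of the original post: a sequential run at 4 KiB and iodepth=1 mostly exercises caching rather than the stripe itself; a direct-I/O, large-block, higher-queue-depth run along the lines below is more likely to show a difference between a single pool and the striped volume. The parameters are illustrative only.)
fio --name=seqwrite --ioengine=libaio --direct=1 --rw=write --bs=1m --numjobs=4 --iodepth=16 --size=64g --runtime=300 --time_based --end_fsync=1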
Did we configure this properly? More specifically:
(1) Was it correct to align the LVM stripe size to that of the hardware RAID (512 KiB)?
(2) Was it correct to set the XFS stripe unit and width as we did (512 KiB stripe unit and 16 data disks), or should we "abstract" the underlying volumes (4 MiB stripe unit and 2 data disks)?
(3) Adding to the confusion is the geometry self-reported by the block devices below:
$ grep "" /sys/block/sda/queue/*_size
/sys/block/sda/queue/hw_sector_size:512
/sys/block/sda/queue/logical_block_size:512
/sys/block/sda/queue/max_segment_size:65536
/sys/block/sda/queue/minimum_io_size:4096
/sys/block/sda/queue/optimal_io_size:1048576
/sys/block/sda/queue/physical_block_size:4096
Thank you!
I would avoid inserting a RAID0 layer on top of ADAPT. Rather, I would create a simple linear LVM pool comprising the two arrays or, alternatively, a single 28-disk array (not utilizing the second controller at all).
With a linear LVM concatenation of the two arrays, XFS will give you added performance by virtue of its own allocation group strategy (the filesystem concurrently issues multiple IOs to various LBA ranges).
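A minimal sketch of the linear (concatenated) option, reusing the VG/LV names from the question for illustration; omitting --type striped and -i gives LVM's default linear allocation:
$ vgcreate vg-gw /dev/sda /dev/sdb
$ lvcreate -y -L 10T -n vol vg-gw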
However, a single 28-disk pool should provide slightly better space efficiency, because less total capacity is reserved for spares relative to user data.
Regarding XFS options, you should use su=512k,sw=8 based on the ADAPT layout. Anyway, with a high-end controller equipped with a large power-loss-protected write cache, this should have only a minor effect.
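For a single ADAPT pool (or a linear concatenation of pools), that geometry would translate to something like the following, shown here against the same device name as the question:
$ mkfs.xfs -f -d su=512k,sw=8 /dev/mapper/vg--gw-vol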
Answered by shodanshok on November 4, 2021