
Increasing IO Performance with RAID on EC2

When you start working with large datasets that don't fit in memory, improving IO performance on EC2 becomes a priority. At BackType, we do that with software RAID across EBS volumes. Because EBS is already replicated, you don't need added redundancy and can settle for RAID 0. As Amazon describes it:

Amazon EBS volume data is replicated across multiple servers in an Availability Zone to prevent the loss of data from the failure of any single component. The durability of your volume depends both on the size of your volume and the percentage of the data that has changed since your last snapshot. As an example, volumes that operate with 20 GB or less of modified data since their most recent Amazon EBS snapshot can expect an annual failure rate (AFR) of between 0.1% – 0.5%, where failure refers to a complete loss of the volume. This compares with commodity hard disks that will typically fail with an AFR of around 4%, making EBS volumes 10 times more reliable than typical commodity disk drives.

Instead of adding redundancy, perform frequent point-in-time EBS snapshots. Eric Hammond created a Perl script called ec2-consistent-snapshot that can be used to take scheduled snapshots. It even performs a LOCK on your MySQL database before taking a snapshot.

Setting it up

After attaching four EBS volumes, here's how we set up RAID 0 with XFS:

# Stripe the four attached EBS volumes into one RAID 0 array
yes | sudo mdadm --create /dev/md0 --chunk=256 --level 0 \
--raid-devices 4 /dev/sdh1 /dev/sdh2 /dev/sdh3 /dev/sdh4
# Record the array so it is reassembled on boot
echo DEVICE /dev/sdh1 /dev/sdh2 /dev/sdh3 /dev/sdh4 | sudo tee /etc/mdadm.conf
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf
# Create the XFS filesystem and mount it at /volr
sudo mkfs.xfs /dev/md0
echo "/dev/md0 /volr xfs noatime 0 0" | sudo tee -a /etc/fstab
sudo mkdir /volr
sudo mount /volr
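Before putting data on the new array, it is worth confirming that it actually came up. The helper below is a minimal sketch, not part of our original setup: md_array_active is a hypothetical function name, and it assumes the usual /proc/mdstat layout in which an assembled array appears on a line such as "md0 : active raid0 ...".

```shell
# Return success if the named md array is listed as active.
# The second argument (status file) defaults to /proc/mdstat; it is a
# parameter only so the check can be tried against a saved copy.
md_array_active() {
    grep -Eq "^$1 : active " "${2:-/proc/mdstat}"
}

# Example use on the instance:
#   md_array_active md0 && echo "md0 is up"
```

For a fuller picture, sudo mdadm --detail /dev/md0 lists the member devices and chunk size, and df -h /volr shows the mounted filesystem.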

A RAID setup like this one also lets us exceed the 1 TB limit of a single EBS volume. With four 1 TB EBS volumes, the example above would produce a 4 TB array.
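The arithmetic is simple because RAID 0 keeps no parity or mirror copies. As a small sketch, assuming equal-sized volumes (raid0_capacity_gb is a hypothetical helper, not an mdadm command):

```shell
# Usable size of a RAID 0 array built from equal-sized members:
# every member contributes its full capacity to the stripe.
raid0_capacity_gb() {
    count=$1
    size_gb=$2
    echo $((count * size_gb))
}

raid0_capacity_gb 4 1024   # four 1 TB (1024 GB) volumes, prints 4096
```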

Finally, we use ec2-consistent-snapshot to take periodic snapshots of our slaves via cron:

ec2-consistent-snapshot --aws-access-key-id KEY \
--aws-secret-access-key SECRET --region us-east-1 \
--mysql-username USER --mysql-password PASS \
--xfs-filesystem /volr vol-aaaa vol-bbbb vol-cccc vol-dddd
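A crontab entry for this might look like the sketch below. The hourly schedule, the wrapper script path /usr/local/bin/snapshot-volr.sh (a script containing the command above) and the log path are all assumptions made for illustration, not our actual configuration. Keeping the command in a wrapper script also keeps the credentials out of the crontab itself.

```shell
# m h dom mon dow  command
0 * * * * /usr/local/bin/snapshot-volr.sh >> /var/log/snapshot-volr.log 2>&1
```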

Performance

Percona has published their findings on how various RAID configurations affect IO performance on EC2 on their MySQL Performance Blog.

Fellow YC start-up Heroku has also published some of their findings for achieving good IO performance from EBS.
