AWS EBS and reliability/durability

EBS stands for Elastic Block Store. It provides block-level storage volumes for use with Amazon EC2 instances. Amazon EBS volumes are designed to be highly available and reliable. According to the Amazon EBS site:

The durability of your volume depends both on the size of your volume and the percentage of the data that has changed since your last snapshot. As an example, volumes that operate with 20 GB or less of modified data since their most recent Amazon EBS snapshot can expect an annual failure rate (AFR) of between 0.1% – 0.5%, where failure refers to a complete loss of the volume. This compares with commodity hard disks that will typically fail with an AFR of around 4%, making EBS volumes 10 times more reliable than typical commodity disk drives.

A couple of points to note about reliability and durability:

  1. Sometimes people complain about losing data on EBS. First of all, no system in the world can provide completely durable storage. So it is important to understand what the service claims to provide and what it actually provides.
  2. EBS provides redundancy by replicating data across multiple hard disks, which increases data durability if one disk fails. For EBS, an AFR of 0.1% to 0.5% means that if you run one volume with approximately 20 GB of changed data, you would on average expect to lose it about once in 200 years (assuming an AFR of 0.5%).
  3. You can increase durability by taking frequent snapshots, which are stored in S3. But you may still lose any data written between the last snapshot and the time of failure.
  4. This is not as bad as it sounds. If you are running a content site (e.g. a blog) and have some sort of backup for the posts you are writing, you can manage with a daily snapshot (or your own daily backup). If you lose the EBS volume, you may have to recreate the most recent few posts. If you are handling critical data such as transactions, it may be useful to store a transaction redo log on S3 for greater durability.
  5. Also note that, technically, S3 is not 100% durable either. But for all practical purposes you can consider it completely durable: Amazon states that it is designed for 99.999999999% (eleven nines) durability.
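
The AFR arithmetic in point 2 can be checked with a few lines of Python. This is only a rough sketch: it assumes a constant annual failure rate and independent failures across volumes, neither of which the EBS documentation promises, and the function names are made up for illustration.

```python
def expected_years_between_losses(afr: float) -> float:
    """Mean number of volume-years per lost volume for a given annual failure rate."""
    return 1.0 / afr


def probability_of_loss(afr: float, volumes: int, years: float) -> float:
    """Chance of losing at least one volume over the period.

    Assumes independent failures and a constant AFR (simplifying assumptions).
    """
    return 1.0 - (1.0 - afr) ** (volumes * years)


for afr in (0.001, 0.005):
    years = expected_years_between_losses(afr)
    print(f"AFR {afr:.1%}: about one loss per {years:.0f} volume-years")
    print(f"  1 volume over 1 year:     {probability_of_loss(afr, 1, 1):.2%}")
    print(f"  100 volumes over 3 years: {probability_of_loss(afr, 100, 3):.2%}")
```

Under these assumptions, a single volume at 0.5% AFR is lost about once per 200 years, but a fleet of 100 such volumes has roughly a 78% chance of losing at least one volume within three years. The per-volume figure looks very different once you run many volumes.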

What you should consider before designing your system:

  1. EBS clearly needs to be backed up to S3. The frequency depends on the kind of site you are running and how large a window of data loss you can afford.
  2. For important transaction data, you may want to write a redo log to S3 before showing a confirmation message to your customer. Remember that, technically, you can always lose committed data in MySQL or any other database running on EBS. A commit is at best as reliable and durable as the storage device it is written to.
  3. You may also accidentally terminate an instance from the AWS console through human error. Terminating a micro instance (which is always EBS-backed) also deletes its root EBS volume by default. Whether an attached volume survives termination is controlled by its delete-on-termination setting, which you can turn off. So it is a good idea to take backups more frequently to guard against such accidental instance termination.
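
The redo log idea from point 2 can be sketched as follows. This is a minimal illustration, not a production design: `record_then_confirm`, the `redo-log/` key layout, and the bucket name are all hypothetical. The function accepts any object with a `put_object(Bucket=..., Key=..., Body=...)` method, which is the shape of a boto3 S3 client, and an in-memory stand-in is used here so the example runs without AWS credentials.

```python
import json
import time
import uuid


def record_then_confirm(s3_client, bucket: str, transaction: dict) -> str:
    """Write a redo-log entry to S3 and only then return a confirmation key.

    If the upload raises, the exception propagates and nothing is confirmed.
    """
    entry_id = uuid.uuid4().hex
    # Millisecond timestamp prefix keeps keys roughly ordered for replay
    # (a hypothetical layout, chosen only for this sketch).
    key = f"redo-log/{int(time.time() * 1000):015d}-{entry_id}.json"
    body = json.dumps(transaction, sort_keys=True).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return key


class InMemoryStore:
    """Stand-in for an S3 client so the sketch runs locally."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body):
        self.objects[(Bucket, Key)] = Body


store = InMemoryStore()
key = record_then_confirm(
    store, "example-redo-log-bucket", {"order": 42, "amount": "19.99"}
)
print("confirmed after durable write:", key)
```

The ordering is the point: the customer sees a confirmation only after the log entry has been accepted by the more durable store, so a later loss of the EBS volume can be repaired by replaying the log.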