Like the Google File System and Google Bigtable papers, this Amazon paper describes a critical part of Amazon's infrastructure, a reliable, fast, and scalable storage system.
Some excerpts from the paper:
There are many services on Amazon's platform that only need primary-key access to a data store ... [and] using a relational database would lead to inefficiencies and limit scale and availability.

The paper overall is fascinating, not only for the discussion of Amazon's needs and how the authors thought about them when building Dynamo, but also for its critical discussion of other distributed databases. The related work section (Section 3) is worth its weight in gold.
Dynamo uses a synthesis of well known techniques to achieve scalability and availability: Data is partitioned and replicated using consistent hashing, and consistency is facilitated by object versioning. The consistency among replicas during updates is maintained by a quorum-like technique and a decentralized replica synchronization protocol. Dynamo employs a gossip based distributed failure detection and membership protocol ... Storage nodes can be added and removed from Dynamo without requiring any manual partitioning or redistribution.
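The partitioning scheme in that excerpt can be sketched in a few lines. This is a minimal illustrative toy, not Amazon's implementation; the class and method names, the use of MD5, and the virtual-node count are all my assumptions. It shows the core idea: hash nodes and keys onto the same ring, and replicate each key on the first N distinct nodes found walking clockwise.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Map a string onto a large circular key space (choice of MD5 is illustrative).
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Toy consistent-hash ring with virtual nodes, in the spirit of
    Dynamo's partitioning scheme. Names and details are illustrative."""

    def __init__(self, nodes, vnodes=8):
        # Each physical node appears at several ring positions ("virtual
        # nodes") to smooth out the load distribution.
        self._ring = []  # sorted list of (ring position, node)
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((_hash(f"{node}#{i}"), node))
        self._ring.sort()

    def preference_list(self, key, n=3):
        """Return the first n distinct nodes clockwise from the key's
        position: the replicas responsible for the key. Assumes n is
        no larger than the number of distinct physical nodes."""
        start = bisect.bisect(self._ring, (_hash(key), ""))
        replicas = []
        i = start
        while len(replicas) < n:
            node = self._ring[i % len(self._ring)][1]
            if node not in replicas:
                replicas.append(node)
            i += 1
        return replicas
```

The appeal of this scheme, as the paper notes, is that adding or removing a node only reassigns the keys adjacent to that node's ring positions, so no manual repartitioning is needed.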
In the past year, Dynamo has been the underlying storage technology for a number of the core services in Amazon's ecommerce platform.
A paper on an Amazon distributed database immediately invites comparison with the Google File System and Bigtable. Unlike Google FS (and also unlike distributed databases like C-Store), Amazon's Dynamo is, oddly, optimized for writes. From the paper:
Many traditional data stores execute conflict resolution during writes and keep the read complexity simple.

This choice surprises me and does appear to lead to some problems later. The 15ms average read latency seems high if any service needs to make more than a dozen database queries when serving a web page; latencies like that make this a pretty heavyweight data store service.
Dynamo targets the design space of ... a data store that is highly available for writes ... We push the complexity of conflict resolution to the reads in order to ensure that writes are never rejected.
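The quote above is what read-time conflict resolution looks like in practice: because writes are never rejected, a read may return several concurrent versions, and the reader must reconcile them. A rough sketch of that reconciliation using version vectors (Dynamo's object versioning), with all function names and details being my assumptions rather than the paper's code:

```python
def descends(a: dict, b: dict) -> bool:
    """True if version vector a has seen every event recorded in b."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def reconcile(versions):
    """Read-time reconciliation sketch: drop any version that is a strict
    ancestor of another, and return the remaining concurrent siblings.
    If more than one survives, the application (e.g. a shopping cart
    merging items) must resolve the conflict itself."""
    survivors = []
    for i, (clock, value) in enumerate(versions):
        obsolete = any(
            j != i and other != clock and descends(other, clock)
            for j, (other, _) in enumerate(versions)
        )
        if not obsolete:
            survivors.append((clock, value))
    return survivors
```

This is the cost the authors accept for never rejecting a write: the complexity lands on every reader instead of the occasional conflicting writer.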
As they describe in the paper, at least a few groups at Amazon needed a much lighter weight service that could be hit thousands of times, so they had to use rather extreme parameters (requiring all replicas to be available for writes, and only one for reads) to force Dynamo to work for them. With those parameters, they effectively turned off the high availability for writes and pushed the complexity of conflict resolution back away from reads, which makes me wonder whether write optimization really should have been part of Dynamo's design goals in the first place.
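The arithmetic behind those "extreme parameters" is the standard quorum overlap rule, which Dynamo exposes as tunable N, R, and W. A small sketch (the function name and example configurations are mine, but the R + W > N rule is from the paper):

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """A read quorum of R and a write quorum of W are guaranteed to
    intersect (so a read sees the latest committed write) when R + W > N,
    where N is the number of replicas."""
    return r + w > n

# A balanced Dynamo-style configuration: tolerates a replica failure
# on both the read and the write path.
assert quorums_overlap(n=3, r=2, w=2)

# The read-heavy configuration described above: every replica must
# acknowledge a write (W = N), so any single replica can serve a read
# (R = 1) -- but now one unreachable replica blocks all writes,
# giving up exactly the write availability Dynamo was designed for.
assert quorums_overlap(n=3, r=1, w=3)
```

Seen this way, the groups in question used the knobs to invert Dynamo's headline trade-off, which is what makes the original design goal look questionable.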
Another difference with Google FS and Bigtable is that Google's systems organize data as a map, supporting high performance range scans over the data. On the one hand, Google may have more need for that, given its work building search indexes and analyzing massive text and log data. On the other hand, Amazon has massive text and log data too, and Dynamo seems like it may not be able to help with large scale data indexing and analysis tasks.
Both the lack of range scans and the optimization for writes over reads appear to stem from the authors' focus on the needs of the shopping cart. They repeatedly return to it as a motivating example. It is not clear to me why they chose to focus on that task over their other needs.
I had a couple of other surprises. Dynamo relies on a random, uniform distribution of keys for load balancing -- something that seems likely to run into problems with highly skewed access patterns -- rather than supporting additional replication of frequently accessed data. More seriously, Dynamo is limited to a few hundred nodes because the authors punted on some of the hard problems of ensuring consistency in metadata (like their routing table) at larger scale.
Overall, a very interesting paper and system from Amazon. I love how Amazon has adapted the motivation of P2P distributed hash tables like Chord and Pastry to an environment where all machines are trusted, like an Amazon data center, taking advantage of that trust to reduce latency and improve reliability. I am also impressed by how remarkably configurable Dynamo is -- from the underlying storage to the number of replicas to the means of conflict resolution -- so that it can adapt to the widely varying needs of different groups at Amazon.
By the way, unlike Google, Yahoo, and Microsoft, Amazon publishes academic papers only rarely. They deserve kudos for doing so here. With this paper, Amazon is revealing some of the remarkable challenges in large scale computing they face. As people are attracted to those challenges, perhaps this will be the start of more openness from Amazon.