DynamoDB

Paper: DynamoDB

Summary/Abstract
Amazon DynamoDB is a NoSQL cloud database service that provides consistent performance at any scale. Its fundamental properties are consistent performance, availability, durability, and a fully managed serverless experience. In 2021, during the 66-hour Amazon Prime Day shopping event, DynamoDB peaked at 89.2 million requests per second while maintaining high availability and single-digit millisecond performance. The design and implementation of DynamoDB have evolved since its first launch in 2012. The system has successfully dealt with issues related to fairness, traffic imbalance across partitions, monitoring, and automated system operations without impacting availability or performance.

Introduction
The design goal of DynamoDB is to complete all requests with low single-digit millisecond latencies. DynamoDB uniquely integrates the following six fundamental system properties: DynamoDB is a fully managed cloud service. DynamoDB employs a multi-tenant architecture. DynamoDB achieves boundless scale for tables. DynamoDB provides predictable performance. DynamoDB is highly available. DynamoDB supports flexible use cases.

DynamoDB evolved as a distributed database service to meet the needs of its customers without losing its key aspect: providing a single-tenant experience to every customer on a multi-tenant architecture. The paper explains the challenges the system faced and how the service evolved to handle them, tying the required changes to a common theme of durability, availability, scalability, and predictable performance.

History
The design of DynamoDB was motivated by Amazon's experience with its predecessor, Dynamo.
Dynamo was created in response to the need for a highly scalable, available, and durable key-value database for shopping cart data. Amazon learned that giving applications direct access to traditional enterprise database instances led to scaling bottlenecks such as connection management, interference between concurrent workloads, and operational problems with tasks such as schema upgrades. A service-oriented architecture was adopted to encapsulate an application's data behind service-level APIs, which allowed enough decoupling to handle tasks like reconfiguration without disrupting clients.

DynamoDB took its principles from two predecessors: Dynamo, which teams ran as a self-hosted database and which created an operational burden for developers, and SimpleDB, a fully managed elastic NoSQL database service whose data model could not scale to the large tables that DynamoDB needed to support.

Dynamo limitations:
SimpleDB limitations:

Amazon concluded that a better solution would combine the best parts of the original Dynamo design (incremental scalability and predictable high performance) with the best parts of SimpleDB (ease of administration of a cloud service, consistency, and a table-based data model that is richer than a pure key-value store).

Architecture
A DynamoDB table is a collection of items. ...

December 11, 2024 · Hemant Sethi

Dynamo

Paper: Dynamo

Dynamo / Distributed Key-Value Store

Problem: Design a distributed key-value store (or distributed hash table) that is highly available (i.e., reliable), highly scalable, and completely decentralized.

Features
A highly available key-value store for workloads such as the shopping cart, bestseller lists, sales rank, and product catalog, which need only primary-key access to data. A multi-table RDBMS would limit scalability and availability. Applications can choose their desired level of availability and consistency.

Background?
Designed for high availability (at massive scale) and partition tolerance at the expense of strong consistency. The primary motivation for optimizing for availability over consistency was to always be up to serve customer requests and so provide a better customer experience. The Dynamo design inspired various NoSQL databases: Cassandra, Riak, Voldemort, and DynamoDB.

Design Goals?
Highly available; reliable; highly scalable; decentralized; eventually consistent (EC), a weaker consistency model than strong consistency (linearizability). (Notes: ) Latency Requirements? (Notes: ) Geographical Distribution of Data?

Use cases
Dynamo can achieve strong consistency, but it comes with a performance impact. If strong consistency is a requirement, Dynamo is not the best option. It suits applications that need tight control over the trade-offs between availability, consistency, cost-effectiveness, and performance, and services that need only primary-key access to the data.

System APIs
get(key) : T… Object, Context
put(key, context, object)
Dynamo treats both the object and the key as an arbitrary array of bytes (typically less than 1 MB). It applies the MD5 hashing algorithm to the key to generate a 128-bit hash ID, which is used to determine the storage nodes responsible for serving the key.
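The key-to-node step can be sketched in a few lines of Python. This is only an illustration, not Dynamo's actual code: `hash_id` computes the 128-bit MD5 ID described above, while `owner`, the token list, and the node names are hypothetical, and the "first token at or after the hash" lookup rule is an assumption made for the example.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 128  # MD5 digests are 128 bits wide


def hash_id(key: bytes) -> int:
    """Return the 128-bit MD5 digest of the key as an integer."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


def owner(key: bytes, tokens: list[int], nodes: list[str]) -> str:
    """Pick the node whose token is the first one >= the key's hash ID.

    `tokens` must be sorted and aligned with `nodes`; the modulo wraps
    around to the first node if the hash is past the last token.
    (Assumed lookup rule, for illustration only.)
    """
    idx = bisect.bisect_left(tokens, hash_id(key)) % len(tokens)
    return nodes[idx]


# Hypothetical three-node layout that splits the hash space evenly.
tokens = [RING_SIZE // 3, 2 * RING_SIZE // 3, RING_SIZE - 1]
nodes = ["node-a", "node-b", "node-c"]

if __name__ == "__main__":
    key = b"cart:42"
    print(hex(hash_id(key)), "->", owner(key, tokens, nodes))
```

Because the hash depends only on the key bytes, any node can compute the same ID and reach the same answer about which nodes serve a key, without consulting a central directory.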
High Level Architecture

Agenda: data distribution (partitioning); data replication and consistency; handling temporary failures (fault tolerance); inter-node communication (unreliable network) and failure detection; high availability; conflict resolution and handling permanent failures.

Data Partitioning
Distributing data across a set of nodes is called data partitioning. ...

December 2, 2024 · Hemant Sethi