Summary of DynamoDB Core Data Modeling Concepts
Introduction
- Jason Hunter and Sean Traver from AWS presented on core data modeling concepts for DynamoDB.
- Audience had varied experience with DynamoDB, ranging from new users to experts.
- The presenters used an analogy of phone books to explain how DynamoDB stores and retrieves data.
DynamoDB Data Model
Partition Key and Sort Key
- The Partition Key (PK) and Sort Key (SK) together uniquely identify an item in DynamoDB.
- Strings are the recommended data type for both PK and SK.
- The PK should identify the entity (for example, a customer), with the SK providing finer-grained detail within that entity.
- SK can be used to sort data within a partition.
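
A minimal sketch of how a string PK/SK pair might be composed and queried. The `CUSTOMER#` and `ORDER#` prefixes, the table name `app-table`, and both helper names are illustrative assumptions, not taken from the talk. The code only builds the keyword arguments for the low-level `query` call; nothing is sent to AWS.

```python
def make_key(customer_id: str, order_date: str) -> dict:
    """Build a composite primary key in DynamoDB's low-level attribute-value format."""
    return {
        "PK": {"S": f"CUSTOMER#{customer_id}"},   # hypothetical prefix convention
        "SK": {"S": f"ORDER#{order_date}"},       # ISO dates sort correctly as strings
    }


def orders_query(customer_id: str, year: str) -> dict:
    """Keyword arguments for a query: one partition, SK narrowed by prefix."""
    return {
        "TableName": "app-table",  # hypothetical table name
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :sk)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"CUSTOMER#{customer_id}"},
            ":sk": {"S": f"ORDER#{year}"},
        },
        "ScanIndexForward": False,  # return items in descending SK order
    }


key = make_key("42", "2024-03-15")
params = orders_query("42", "2024")
# With boto3 installed and credentials configured, this could be run as:
#   boto3.client("dynamodb").query(**params)
```

Because items in a partition are stored in SK order, a prefix condition such as `begins_with` selects a contiguous range rather than filtering item by item.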
Local Secondary Indexes (LSI) and Global Secondary Indexes (GSI)
- An LSI has the same PK as the base table, but a different SK.
- A GSI can use a different PK and SK from the base table.
- Indexes provide alternate ways to query and access data.
- Writes are reflected in an LSI in time for strongly consistent reads, whereas GSIs are updated asynchronously and support only eventually consistent reads.
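
As a sketch of how the two index types differ in a table definition, the dictionary below follows the parameter shape of the `create_table` API. The table, index, and attribute names (`app-table`, `LSI1`, `GSI1`, `CreatedAt`, `GSI1PK`, `GSI1SK`) are assumptions chosen for illustration.

```python
table_definition = {
    "TableName": "app-table",          # hypothetical
    "BillingMode": "PAY_PER_REQUEST",  # on-demand, so no throughput settings needed
    "AttributeDefinitions": [
        {"AttributeName": "PK", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
        {"AttributeName": "CreatedAt", "AttributeType": "S"},
        {"AttributeName": "GSI1PK", "AttributeType": "S"},
        {"AttributeName": "GSI1SK", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "PK", "KeyType": "HASH"},
        {"AttributeName": "SK", "KeyType": "RANGE"},
    ],
    # LSI: same partition key as the table, alternate sort key.
    "LocalSecondaryIndexes": [{
        "IndexName": "LSI1",
        "KeySchema": [
            {"AttributeName": "PK", "KeyType": "HASH"},
            {"AttributeName": "CreatedAt", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
    # GSI: its own partition key and sort key.
    "GlobalSecondaryIndexes": [{
        "IndexName": "GSI1",
        "KeySchema": [
            {"AttributeName": "GSI1PK", "KeyType": "HASH"},
            {"AttributeName": "GSI1SK", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
}


def hash_key(key_schema: list) -> str:
    """Return the partition (HASH) key attribute name from a key schema."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


table_pk = hash_key(table_definition["KeySchema"])
lsi_pk = hash_key(table_definition["LocalSecondaryIndexes"][0]["KeySchema"])
gsi_pk = hash_key(table_definition["GlobalSecondaryIndexes"][0]["KeySchema"])
# With boto3 available: boto3.client("dynamodb").create_table(**table_definition)
```

One practical consequence of the design: LSIs must be declared when the table is created, while GSIs can be added to an existing table later.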
Table Design Considerations
- Start with a basic design and iterate as needed.
- DynamoDB hashes the PK to place items on partitions, so choose high-cardinality PK values to spread data and traffic evenly.
- Partitions split automatically when one grows too large or sustains high throughput.
- Keep attribute names short: names count toward item size, and therefore toward storage and throughput costs.
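
To make the attribute-name point concrete, here is a rough estimator for items that contain only string attributes. It assumes the documented rule that a string attribute contributes the UTF-8 byte length of its name plus its value, and that a standard write consumes one write unit per 1 KB, rounded up. The function names and sample attribute names are hypothetical, and other data types are sized differently.

```python
import math


def string_item_size(item: dict[str, str]) -> int:
    """Approximate size in bytes of an item made up only of string attributes."""
    return sum(
        len(name.encode("utf-8")) + len(value.encode("utf-8"))
        for name, value in item.items()
    )


def write_units(size_bytes: int) -> int:
    """Standard write units: 1 per 1 KB of item size, rounded up."""
    return max(1, math.ceil(size_bytes / 1024))


address = "x" * 1000  # stand-in for a 1000-byte value

verbose = {"customer_identifier": "42", "shipping_address_line": address}
short = {"cid": "42", "addr": address}

verbose_size = string_item_size(verbose)  # 19 + 2 + 21 + 1000 = 1042 bytes
short_size = string_item_size(short)      # 3 + 2 + 4 + 1000 = 1009 bytes

print(verbose_size, write_units(verbose_size))
print(short_size, write_units(short_size))
```

In this contrived case the longer names push the item over the 1 KB boundary, doubling the write cost of every update to it.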
Advanced Concepts
Soft Deletes and Archiving
- Use a GSI to track "deleted" or "archived" items.
- Shard the GSI PK to distribute the load.
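
One way the sharded GSI key could be derived is sketched below. The `ARCHIVED#` prefix, the shard count of 8, and the function names are assumptions for illustration. Only items that carry the GSI key attribute appear in the index, so setting this value on archived items alone keeps the index sparse.

```python
import hashlib

SHARD_COUNT = 8  # assumed; size this to the expected write rate


def shard_for(item_id: str, shard_count: int = SHARD_COUNT) -> int:
    """Map an item ID to a stable shard number using a deterministic hash."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % shard_count


def archived_gsi_pk(item_id: str) -> str:
    """GSI partition key value to set on an item when it is archived."""
    return f"ARCHIVED#{shard_for(item_id)}"


def all_archived_pks(shard_count: int = SHARD_COUNT) -> list[str]:
    """Every GSI partition key a reader must query to list all archived items."""
    return [f"ARCHIVED#{n}" for n in range(shard_count)]


pk = archived_gsi_pk("order-1001")
```

The trade-off is on the read side: listing every archived item means issuing one query per shard and merging the results.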
Accurate Counting
- Use DynamoDB Streams and Lambda functions to count updates.
- Leverage conditional updates and string sets to de-duplicate counts.
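
The de-duplication idea can be sketched as a conditional `update_item` request that adds an event ID to a string set and increments the counter only if that ID has not been seen. This is one possible reading of the technique, not necessarily the presenters' exact method; the attribute names `SeenEventIds` and `Total`, the table name, and both function names are hypothetical. A small in-memory function mirrors the intended behaviour so the logic can be checked without AWS.

```python
def counted_update(counter_key: dict, event_id: str) -> dict:
    """Keyword arguments for update_item: count an event at most once."""
    return {
        "TableName": "app-table",  # hypothetical
        "Key": counter_key,
        # ADD appends to the string set and increments the number in one write.
        "UpdateExpression": "ADD #seen :id_set, #total :one",
        # Reject the write if this event ID is already in the set.
        "ConditionExpression": "NOT contains(#seen, :id)",
        "ExpressionAttributeNames": {"#seen": "SeenEventIds", "#total": "Total"},
        "ExpressionAttributeValues": {
            ":id_set": {"SS": [event_id]},
            ":id": {"S": event_id},
            ":one": {"N": "1"},
        },
    }


def apply_locally(counter: dict, event_id: str) -> bool:
    """In-memory stand-in for the conditional update, for illustration only."""
    seen = counter.setdefault("SeenEventIds", set())
    if event_id in seen:
        return False  # DynamoDB would reject the write as a failed condition
    seen.add(event_id)
    counter["Total"] = counter.get("Total", 0) + 1
    return True


counter = {}
results = [apply_locally(counter, e) for e in ["evt-1", "evt-2", "evt-1"]]
request = counted_update({"PK": {"S": "COUNTER#orders"}, "SK": {"S": "TOTAL"}}, "evt-1")
```

Since stream records can be delivered to a Lambda function more than once, a guard like this keeps retries from inflating the count. The set grows with every event, so in practice it would need pruning or windowing to stay under the 400 KB item size limit.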
Integrating with Amazon OpenSearch Service
- Use the DynamoDB zero-ETL integration with Amazon OpenSearch Service to add advanced search capabilities.
- Transform data in the ingestion pipeline with Data Prepper.
Cost Optimization
- Compress data to reduce item size and lower costs.
- Separate frequently updated data from static data to minimize write costs.
- Utilize the new Incremental Backup feature in DynamoDB Point-in-Time Recovery to save on restore costs.
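
As a sketch of the compression tip, a large attribute can be serialized, compressed on the client, and stored as a Binary attribute. The helper names and the sample payload are assumptions for illustration; zlib is used here only because it ships with Python, and any codec would serve.

```python
import json
import zlib


def compress_attribute(payload: dict) -> bytes:
    """Serialize a payload to compact JSON and compress it for storage as Binary."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, level=9)


def decompress_attribute(blob: bytes) -> dict:
    """Reverse compress_attribute after reading the item back."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


payload = {"notes": "status=ok;" * 200}  # repetitive sample data compresses well
raw_size = len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
blob = compress_attribute(payload)
restored = decompress_attribute(blob)

print(raw_size, len(blob))
```

The savings depend heavily on the data, and a compressed attribute is opaque to DynamoDB, so it cannot serve as a key or be inspected in filter or condition expressions.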
Conclusion
The presenters provided a comprehensive overview of core DynamoDB data modeling concepts, including advanced techniques and cost optimization strategies. The session aimed to benefit both new and experienced DynamoDB users.