AWS re:Invent 2025 - Advanced data modeling with Amazon DynamoDB (DAT414)

Advanced Data Modeling with Amazon DynamoDB

Key Characteristics of DynamoDB

Fully Managed Service

  • DynamoDB is a fully managed, highly available, and scalable NoSQL database service provided by AWS.
  • It has a regional, multi-tenant, self-healing fleet of storage nodes, load balancers, and request routers that ensure high availability and consistent performance.

Consumption-Based Pricing

  • DynamoDB uses a consumption-based pricing model, where you pay for the read and write capacity units (RCUs and WCUs) consumed, as well as the storage used.
  • This pricing model makes billing predictable: you can estimate the cost impact of a data model change before you make it.

Consistent Performance at Any Scale

  • DynamoDB provides consistent, single-digit millisecond latency performance, regardless of the table size, from megabytes to petabytes.
  • This is achieved through DynamoDB's partitioning and distribution of data across multiple storage nodes, which allows for constant-time access to your data.

Understanding DynamoDB Data Modeling

Primary Keys

  • DynamoDB tables require a primary key, which can be either a simple primary key (partition key only) or a composite primary key (partition key and sort key).
  • The primary key must be unique for each item; with a composite key, it is the partition key and sort key combination that must be unique. DynamoDB uses the primary key to distribute and access data efficiently.

Partitioning and the DynamoDB API

  • DynamoDB automatically partitions your data across multiple storage nodes based on the partition key.
  • The DynamoDB API provides single-item actions (CRUD operations) that require the full primary key, as well as query operations that allow you to fetch multiple items with the same partition key.
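These two read shapes can be sketched as request parameters. This is a minimal illustration assuming a hypothetical "Orders" table with partition key "customer_id" and sort key "order_id" (the table and attribute names are illustrative, not from the talk):

```python
# Single-item action vs. query, expressed as DynamoDB request parameters.

def get_item_request(customer_id: str, order_id: str) -> dict:
    """Single-item action: the FULL primary key is required."""
    return {
        "TableName": "Orders",
        "Key": {
            "customer_id": {"S": customer_id},
            "order_id": {"S": order_id},
        },
    }

def query_request(customer_id: str) -> dict:
    """Query: fetch every item that shares one partition key."""
    return {
        "TableName": "Orders",
        "KeyConditionExpression": "customer_id = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
    }

# With boto3, these parameter dicts would be passed straight through:
#   client = boto3.client("dynamodb")
#   client.get_item(**get_item_request("c-1", "o-42"))
#   client.query(**query_request("c-1"))
```

Because a query is scoped to a single partition key, it hits a bounded slice of the keyspace, which is what keeps its cost independent of total table size.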

Secondary Indexes

  • Secondary indexes in DynamoDB provide additional read-based access patterns by creating a fully managed copy of your data with a new primary key.
  • There are two types of secondary indexes: global secondary indexes (GSIs), which can be added to a table at any time, and local secondary indexes (LSIs), which must be defined when the table is created.

Data Modeling Goals and Process

Maintaining Data Integrity

  • Ensure a valid schema in your application code to maintain the integrity of the data you're saving.
  • Enforce constraints and handle data consistency when duplicating data across items.
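One common way to enforce a constraint at write time is a conditional write. The sketch below, assuming a hypothetical "Users" table keyed on "username", rejects a put if the item already exists:

```python
# A PutItem request that enforces username uniqueness via a condition.

def create_user_request(username: str, email: str) -> dict:
    return {
        "TableName": "Users",
        "Item": {
            "username": {"S": username},
            "email": {"S": email},
        },
        # DynamoDB rejects the write with ConditionalCheckFailedException
        # if an item with this primary key is already present.
        "ConditionExpression": "attribute_not_exists(username)",
    }
```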

Enabling Efficient Data Access

  • Design your primary keys to uniquely identify and efficiently access the relevant data.
  • Use secondary indexes to enable additional read-based access patterns.

Keeping It Simple

  • Avoid over-complicating your data model and stick to the basics of DynamoDB's partitioning and API.
  • Understand your access patterns upfront and model your data accordingly.

Evolving Your DynamoDB Schema

Schema Changes Not Affecting Data Access

  • Adding new, unindexed attributes to your table is the easiest type of schema evolution, as DynamoDB is schema-less.
  • You can handle these changes entirely within your application code.
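One way to handle a newly added, unindexed attribute purely in application code is to default it on read, so items written before the attribute existed still deserialize cleanly. The "locale" attribute here is a hypothetical example:

```python
# Application-level default for an attribute added after launch.

def normalize_user(item: dict) -> dict:
    """Older items that predate the attribute get a default value."""
    item.setdefault("locale", {"S": "en-US"})
    return item
```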

Adding New Indexes

  • You can add new global secondary indexes (GSIs) to your DynamoDB table at any time, and DynamoDB will automatically backfill the index for you.
  • This allows you to enable new read-based access patterns without modifying your existing data.
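Adding a GSI is an UpdateTable call. The sketch below builds the request parameters for a hypothetical index on a string attribute (the table, index, and attribute names are illustrative):

```python
# UpdateTable parameters that create a new GSI; DynamoDB backfills it
# from existing items automatically.

def add_gsi_request(table: str, index: str, pk_attr: str) -> dict:
    return {
        "TableName": table,
        # Any attribute used in an index key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": pk_attr, "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index,
                    "KeySchema": [
                        {"AttributeName": pk_attr, "KeyType": "HASH"},
                    ],
                    # Copy all attributes into the index for simplicity;
                    # narrower projections reduce index storage cost.
                    "Projection": {"ProjectionType": "ALL"},
                }
            }
        ],
    }
```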

Changing Existing Data

  • If you need to add a new attribute or index that requires updating existing data, you'll need to perform a backfill operation.
  • Tools like AWS Glue, AWS Step Functions, and the DynamoDB Bulk Executor can help automate and simplify this process.
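The core of any backfill is a loop that emits one update per existing item. A minimal, idempotent sketch (attribute names are illustrative): each item gets a new "status" attribute only if it is missing, so the backfill is safe to re-run or resume after a failure:

```python
# Generate one UpdateItem request per scanned item; the condition makes
# the backfill idempotent.

def backfill_updates(items, table="Orders"):
    for item in items:
        yield {
            "TableName": table,
            "Key": {
                "customer_id": item["customer_id"],
                "order_id": item["order_id"],
            },
            "UpdateExpression": "SET #s = :v",
            # Skip items another run already backfilled.
            "ConditionExpression": "attribute_not_exists(#s)",
            "ExpressionAttributeNames": {"#s": "status"},
            "ExpressionAttributeValues": {":v": {"S": "ACTIVE"}},
        }
```

In practice the `items` iterable would come from a paginated (and often parallel) Scan, which is exactly the plumbing that tools like AWS Glue and Step Functions take off your hands.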

Anti-Patterns and Best Practices

Avoiding the "Kitchen Sink" Item Collection

  • Only group different entity types in the same item collection if you have at least one access pattern that requires fetching them together.
  • Prefer to have separate tables or item collections for unrelated entities.

Embracing the DynamoDB API

  • Avoid hiding the DynamoDB API behind abstraction layers, as this can lead to performance issues and suboptimal usage of the service.
  • Leverage the full capabilities of the DynamoDB API, such as atomic updates and conditional writes, to optimize your data model.
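An atomic update is a good example of an API capability that an abstraction layer often hides. The sketch below, assuming the same hypothetical "Orders" table, increments a counter server-side, so there is no read-modify-write race:

```python
# Atomic in-place counter increment with a guard condition.

def increment_views_request(customer_id: str, order_id: str) -> dict:
    return {
        "TableName": "Orders",
        "Key": {
            "customer_id": {"S": customer_id},
            "order_id": {"S": order_id},
        },
        # ADD increments the number on the server; concurrent callers
        # cannot overwrite each other's increments.
        "UpdateExpression": "ADD view_count :one",
        # Guard against accidentally creating a bare item.
        "ConditionExpression": "attribute_exists(customer_id)",
        "ExpressionAttributeValues": {":one": {"N": "1"}},
    }
```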

Managing Item Size and Transactions

  • Keep item sizes small to minimize costs and improve performance.
  • Use transactions sparingly, only for low-volume, high-value operations where data consistency is critical.
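A balance transfer is the canonical low-volume, high-value case. This hedged sketch of a TransactWriteItems payload assumes hypothetical "Accounts" items keyed on "account_id"; both updates succeed or neither does:

```python
# Two updates bundled into one all-or-nothing transaction.

def transfer_request(src: str, dst: str, amount: int) -> dict:
    def update(account_id: str, delta: str) -> dict:
        return {
            "Update": {
                "TableName": "Accounts",
                "Key": {"account_id": {"S": account_id}},
                "UpdateExpression": "ADD balance :d",
                "ExpressionAttributeValues": {":d": {"N": delta}},
            }
        }
    return {
        "TransactItems": [
            update(src, str(-amount)),  # debit the source account
            update(dst, str(amount)),   # credit the destination
        ],
    }
```

A production version would also add a ConditionExpression on the debit side (e.g. that the balance covers the amount), which is precisely the kind of invariant that justifies paying the transaction's extra cost.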

Key Takeaways

  • Understand DynamoDB's unique characteristics, including its partitioning, API, and consumption-based pricing, to design an effective data model.
  • Focus on maintaining data integrity and enabling efficient data access through thoughtful primary key and secondary index design.
  • Leverage DynamoDB's schema flexibility to evolve your data model over time, using tools to automate backfill operations when necessary.
  • Avoid anti-patterns like the "kitchen sink" item collection and embrace the DynamoDB API to optimize your application's performance and cost-effectiveness.
