AWS re:Invent 2025 - Advanced data modeling for Amazon ElastiCache (DAT438)

Advanced Data Modeling for Amazon ElastiCache (DAT438)

Introduction to Amazon ElastiCache and Valkey

  • Amazon ElastiCache is a fully managed in-memory data store service that provides microsecond response times.
  • ElastiCache supports three engines: Valkey, Memcached, and Redis OSS.
  • Valkey is an open-source, high-performance key-value data store that was forked from Redis.
  • The presentation focuses on using the Valkey engine to build a highly scalable Massively Multiplayer Online (MMO) game application.

Caching and Lazy Loading

  • Caching with ElastiCache can significantly improve application performance by serving data from memory instead of a slower persistent database.
  • Lazy loading is a caching strategy where data is fetched from the database and cached on-demand when there is a cache miss.
  • Invalidation can be used to keep the cache fresh by triggering a Lambda function to update the cache when data changes in the database.
  • Time-to-live (TTL) can be used to automatically evict stale data from the cache, and a background task can be used to update the TTL for hot items before expiration.
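The lazy-loading and TTL bullets above can be sketched as follows. This is a minimal illustration, not the talk's code: `TTLCache` is a hypothetical in-memory stand-in for the remote cache, and `get_player`, `load_from_db`, and the `player:` key prefix are names assumed for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory stand-in for a remote cache with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # An expired entry is dropped and treated as a cache miss.
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._items[key] = (value, self._clock() + ttl_seconds)


def get_player(
    player_id: str,
    cache: TTLCache,
    load_from_db: Callable[[str], str],
    ttl_seconds: float = 60.0,
) -> str:
    """Lazy loading: read the cache first, fall back to the database on a miss."""
    key = f"player:{player_id}"  # assumed key naming scheme
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(player_id)
    cache.set(key, value, ttl_seconds)  # populate the cache for later readers
    return value
```

The clock is injectable so that expiry can be exercised without sleeping. Against a real cache server, the expiry bookkeeping would be done by the server rather than by the client.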

Avoiding the Thundering Herd Problem

  • The thundering herd problem occurs when multiple processes/threads try to fetch the same expired cache item, leading to high contention and pressure on the backend database.
  • A synchronization mechanism can be built on Valkey's SET command with the NX (set only if the key does not exist) and EX (expire after a given number of seconds) options, so that only one client fetches the data from the database and populates the cache.
  • Other clients wait for the lock to be released before fetching the data from the now-populated cache.
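A rough sketch of the lock-then-populate flow described above. `FakeStore` is a hypothetical in-memory stand-in that imitates only the NX and EX behavior of SET; `fetch_with_lock`, the `lock:` key prefix, and the retry parameters are assumptions made for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FakeStore:
    """Hypothetical in-memory stand-in imitating SET with NX/EX, GET, and DEL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # expired keys disappear
            return None
        return value

    def set(self, key: str, value: str, nx: bool = False,
            ex: Optional[float] = None) -> bool:
        if nx and self.get(key) is not None:
            return False  # NX: refuse to overwrite a live key
        expires_at = None if ex is None else self._clock() + ex
        self._data[key] = (value, expires_at)
        return True

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def fetch_with_lock(store: FakeStore, key: str, load_from_db: Callable[[str], str],
                    ttl: float = 60, lock_ttl: float = 5,
                    retries: int = 50, wait: float = 0.01) -> str:
    """Only the client that wins the NX lock queries the database."""
    lock_key = f"lock:{key}"  # assumed lock key naming
    for _ in range(retries):
        value = store.get(key)
        if value is not None:
            return value  # someone already populated the cache
        if store.set(lock_key, "1", nx=True, ex=lock_ttl):
            try:
                value = load_from_db(key)
                store.set(key, value, ex=ttl)
                return value
            finally:
                # Simplification: a production lock would store a unique token
                # and delete the lock only if the token still matches.
                store.delete(lock_key)
        time.sleep(wait)  # lock held elsewhere: wait, then re-check the cache
    raise TimeoutError(f"gave up waiting for {key!r}")
```

The EX expiry on the lock matters because a client that crashes while holding it would otherwise block every other reader indefinitely.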

Client-Side Caching

  • Client-side caching can be used to further reduce latency by serving requests from a local cache on the client.
  • Clients can maintain a connection pool with a dedicated "invalidation" connection to receive notifications when cache items are updated, allowing the local cache to be invalidated.
  • This approach uses Valkey's pub/sub functionality to efficiently manage cache invalidation.
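The client side of this pattern can be sketched as a local dictionary plus an invalidation hook. In this illustration the hook is called directly; in a real deployment it would be driven by messages arriving on the dedicated invalidation connection. `LocalCache`, `fetch_remote`, and `on_invalidate` are hypothetical names.

```python
from typing import Callable, Dict


class LocalCache:
    """Hypothetical client-side cache in front of a remote lookup function."""

    def __init__(self, fetch_remote: Callable[[str], str]) -> None:
        self._fetch_remote = fetch_remote
        self._local: Dict[str, str] = {}
        self.remote_reads = 0  # counts round trips to the remote cache

    def get(self, key: str) -> str:
        if key in self._local:
            return self._local[key]  # served locally, no network round trip
        self.remote_reads += 1
        value = self._fetch_remote(key)
        self._local[key] = value
        return value

    def on_invalidate(self, key: str) -> None:
        """Drop the local copy when the server reports that the key changed."""
        self._local.pop(key, None)
```

Until an invalidation arrives, the client keeps serving its local copy, so the freshness of the local cache depends on invalidation messages being delivered promptly.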

Data Structures in Valkey

  • Hash data structures are useful for storing session information, providing fast random access.
  • JSON data structures are suitable for storing more complex, nested data, with the ability to filter and update elements using JSON paths.
  • The choice between hash and JSON depends on the complexity of the data being stored.

Semantic Caching with Vector Search

  • Semantic caching uses vector embeddings to capture the semantic relationships between data, enabling contextual search and retrieval.
  • The process involves ingesting data, splitting it into chunks, and converting it to vector representations.
  • Valkey's vector search functionality allows for efficient retrieval of semantically similar data, even when the query is not an exact match.
  • This is demonstrated through a chatbot use case, where the vector search can understand the intent behind similar queries and return relevant responses.
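To make the idea concrete, here is a toy semantic cache that matches queries by cosine similarity. It is a sketch under stated assumptions: embeddings are passed in as plain lists of floats (a real system would get them from an embedding model), the lookup is a brute-force scan rather than a vector index, and `SemanticCache` and the 0.9 threshold are illustrative choices.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical cache keyed by embedding similarity instead of exact text."""

    def __init__(self, threshold: float = 0.9) -> None:
        self._threshold = threshold  # assumed similarity cutoff
        self._entries: List[Tuple[List[float], str]] = []

    def put(self, embedding: Sequence[float], answer: str) -> None:
        self._entries.append((list(embedding), answer))

    def lookup(self, embedding: Sequence[float]) -> Optional[str]:
        """Return the closest stored answer if it is similar enough, else None."""
        best_score, best_answer = 0.0, None
        for stored, answer in self._entries:
            score = cosine_similarity(embedding, stored)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self._threshold else None
```

A query whose embedding points in nearly the same direction as a stored one returns the cached answer, while an unrelated query falls through to the slower path (for a chatbot, the model call).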

Pub/Sub Messaging

  • Valkey's pub/sub functionality can be used to build an ephemeral in-game chat feature, where users can publish and subscribe to chat room topics.
  • Valkey supports both classic pub/sub (where messages are forwarded to all nodes) and sharded pub/sub (where messages are routed to specific shards for better scalability).
  • The example demonstrates how Alice and Bob can subscribe to a "Shadow Dragons" chat room topic and publish/receive messages.
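The ephemeral nature of pub/sub can be illustrated with a toy in-process broker. This is only a model of the semantics, not a client for the real service: `ChatBroker` and the `shadow-dragons` channel name are hypothetical, and delivery happens through plain callbacks.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class ChatBroker:
    """Hypothetical in-process model of fire-and-forget pub/sub."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver to current subscribers only; nothing is stored for later."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)  # number of subscribers that received the message
```

Because messages are not retained, a player who joins the chat room late sees only what is published after subscribing. That is acceptable for transient chat; a feature that needs history would require a different structure.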

Probabilistic Data Structures

  • HyperLogLog is a probabilistic data structure that approximates the cardinality (unique count) of a set using a small, fixed-size structure, with an error rate below 1%.
  • This is useful for tracking unique daily active users in the game, providing significant memory savings compared to a traditional set data structure.
  • Bloom filters are another probabilistic data structure that can efficiently test membership, allowing for over 90% memory usage savings compared to a set, with a tunable false positive rate.
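The memory and false-positive trade-off of a Bloom filter can be seen in a small standalone sketch. This is a generic textbook construction written for illustration, not the engine's implementation: `BloomFilter` is a hypothetical class, the sizing uses the standard formulas for bit count and hash count, and SHA-256 with double hashing is an assumed choice of hash scheme.

```python
import hashlib
import math
from typing import Iterator


class BloomFilter:
    """Hypothetical Bloom filter sized for a target capacity and error rate."""

    def __init__(self, capacity: int, error_rate: float = 0.01) -> None:
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.size = max(1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.hash_count = max(1, round(self.size / capacity * math.log(2)))
        self._bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Double hashing: derive k bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.hash_count):
            yield (h1 + i * h2) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))
```

With a capacity of 1,000 items and a 1% target error rate, the bit array takes roughly 1.2 KB regardless of how long the item strings are, which is where the savings over storing every member in a set come from.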

Geospatial Queries

  • Valkey's geospatial commands can be used to store and query location data, such as the positions of players and points of interest in the game world.
  • The data is stored using a geohash algorithm, which maps latitude and longitude coordinates to a compact integer representation that preserves spatial proximity.
  • Because these integers are stored as scores in a Valkey sorted set, bounding-box and radius queries for nearby objects are efficient.
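The proximity-preserving encoding can be sketched by quantizing each coordinate and interleaving the bits. This is a simplified illustration: it uses the full latitude range of ±90 degrees and an arbitrary bit order, so its output should not be assumed to match the integers the engine actually stores. `encode_geohash_int` and `common_prefix_bits` are hypothetical helpers.

```python
def encode_geohash_int(lat: float, lon: float, bits_per_axis: int = 26) -> int:
    """Interleave quantized longitude and latitude bits into one integer."""
    cells = 1 << bits_per_axis

    def quantize(value: float, low: float, high: float) -> int:
        # Map [low, high] onto the integer grid 0 .. cells - 1.
        scaled = (value - low) / (high - low)
        return min(int(scaled * cells), cells - 1)

    lat_q = quantize(lat, -90.0, 90.0)
    lon_q = quantize(lon, -180.0, 180.0)
    code = 0
    for i in range(bits_per_axis):
        code |= ((lat_q >> i) & 1) << (2 * i)        # latitude on even bits
        code |= ((lon_q >> i) & 1) << (2 * i + 1)    # longitude on odd bits
    return code


def common_prefix_bits(a: int, b: int, total_bits: int = 52) -> int:
    """Number of leading bits two codes share; more shared bits means closer cells."""
    return total_bits - (a ^ b).bit_length()
```

Points that are close together usually share a long bit prefix, so they sort near each other by score and a range scan over the sorted set can find them. Neighbors that straddle a cell boundary can share only a short prefix, which is why practical radius queries also check adjacent cells.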

Rate Limiting

  • Rate limiting can be implemented in Valkey using a counter and a Lua script that performs the check-and-update atomically.
  • A simple rate limiter can be built to allow a fixed number of requests within a time window, resetting the counter when the time expires.
  • A more advanced token bucket rate limiter can be implemented, where the bucket has a capacity and a refill rate, allowing for more granular control over resource usage.
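The token bucket arithmetic can be shown in a single-process sketch. In the setup the talk describes, this read-refill-decrement sequence would run atomically on the server (for example inside a Lua script) so that concurrent clients cannot interleave; the version below only illustrates the logic. `TokenBucket` and its parameters are hypothetical names.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical token bucket: holds up to `capacity` tokens, refilled over time."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._last_refill
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._last_refill = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Compared with a fixed-window counter, the bucket permits short bursts up to its capacity while still bounding the long-run rate at the refill rate.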

Key Takeaways

  • Valkey, as part of Amazon ElastiCache, provides a rich set of data structures and features for building highly scalable, high-performance applications.
  • Techniques like lazy loading, cache invalidation, client-side caching, and probabilistic data structures can significantly improve application performance and reduce resource usage.
  • Valkey's pub/sub, geospatial, and rate limiting capabilities enable the development of complex, real-time features for applications like MMO games.
  • The presented examples demonstrate how Valkey's advanced data modeling capabilities can be applied to solve common challenges in building scalable, low-latency applications.
