AWS re:Invent 2025 - Better, faster, cheaper: How Valkey is revolutionizing caching (DAT458)
Valkey: Revolutionizing Caching Performance and Reliability
Introduction to Valkey
Origin of Valkey
Valkey is an open-source, BSD-licensed data store for high-performance key-value workloads
It was created in 2024 as a community-driven fork of the last open-source version of Redis (7.2) after Redis changed its licensing
Valkey is backed by the Linux Foundation and has seen rapid growth, with two major releases in its first year and a half
Valkey Governance and Adoption
Valkey has a multi-vendor governance model and is not controlled by any single vendor
After 1.5 years: over 50 contributing organizations, 150+ code contributors, and 10+ managed service providers
Tremendous community momentum, with over 1,000 commits and strong container image pull numbers
Performance Improvements with Multi-Threaded Architecture
Challenges with Spiky Caching Workloads
Many customers struggle with provisioning enough capacity to handle spikes in traffic and workloads
Customers often must choose between scaling up and scaling out, which leads to complex application decomposition
Limitations of Single-Threaded Valkey 7.2
Valkey 7.2 used a single-threaded architecture to keep command execution simple and avoid race conditions
Profiling showed that a significant portion of CPU time was spent on I/O-heavy tasks such as reading from and writing to client connections
Multi-Threaded Architecture in Valkey 8.0
Valkey 8.0 introduced a multi-threaded architecture, offloading I/O-heavy tasks to separate threads
The main execution thread remains single-threaded while the I/O threads use additional CPU cores for higher throughput
Benchmarks showed up to 230% performance improvements for throughput-bound workloads
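The split described above can be sketched in a few lines. This is a minimal, hypothetical Python model (Valkey itself is implemented in C), assuming I/O threads do the parsing and hand finished commands to the one thread that owns the keyspace:

```python
import queue
import threading

# Sketch of the threaded-I/O idea: I/O workers parse client requests,
# but only a single execution thread ever mutates the keyspace, so
# command execution stays race-free without locks.

command_queue = queue.Queue()
store = {}

def io_worker(raw_requests):
    # Simulates an I/O thread: the CPU cost of parsing bytes into
    # commands is paid here, off the execution thread.
    for raw in raw_requests:
        command_queue.put(raw.split())

def execute(n_commands):
    # The only code that touches `store`.
    for _ in range(n_commands):
        cmd = command_queue.get()
        if cmd[0] == "SET":
            store[cmd[1]] = cmd[2]
        elif cmd[0] == "GET":
            store.get(cmd[1])

clients = [["SET user:1 alice"], ["SET user:2 bob"]]
workers = [threading.Thread(target=io_worker, args=(reqs,)) for reqs in clients]
for w in workers:
    w.start()
for w in workers:
    w.join()
execute(2)
print(store)  # both SETs applied by the single execution thread
```

Because the keyspace is still single-writer, the memory model of 7.2 is preserved; only the networking work is parallelized.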
Serverless ElastiCache with Multi-Threading
The multi-threading improvements enabled the development of a serverless version of ElastiCache
Serverless ElastiCache uses a Network Load Balancer, a proxy fleet, and dynamic scaling of caching nodes to keep capacity fully utilized
Customers can now pay per request and for memory used, without having to manage provisioning and scaling
Reliability Improvements
Importance of Cache Reliability
Caching is critical for performance, and customers want the cache to be significantly more available than the backend services
Valkey focuses on two key areas for reliability: managed upgrades and unmanaged node failures
Managed Upgrades
Valkey uses an "N+K" upgrade strategy, adding new replica nodes during version upgrades
The upgrade process involves taking a full snapshot, streaming incremental writes, and safely transferring shard ownership
Improvements include forward-compatible snapshots and dual-channel replication to reduce primary load
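The snapshot-then-stream flow above can be illustrated with a toy simulation. This is a hedged sketch of the sequence, not Valkey's replication protocol; `replication_buffer` is a hypothetical stand-in for the incremental write stream:

```python
# 1. Take a full snapshot, 2. stream writes that arrived during the
# copy, 3. the new node is consistent and shard ownership can move.

primary = {"a": 1, "b": 2}
replication_buffer = []  # writes accepted while the snapshot copies

def write(key, value):
    primary[key] = value
    replication_buffer.append((key, value))

snapshot = dict(primary)   # step 1: full snapshot sent to the new node
write("c", 3)              # traffic keeps flowing during the copy
write("a", 9)
for key, value in replication_buffer:
    snapshot[key] = value  # step 2: replay the incremental stream
new_node = snapshot        # step 3: safe to transfer shard ownership
print(new_node == primary)  # True
```

Dual-channel replication refines step 1: the bulk snapshot and the incremental stream travel on separate channels, reducing load on the primary.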
Unmanaged Node Failures
Valkey uses a cluster bus to detect node failures and trigger failovers
The failover process involves a quorum-based election, with improvements to handle multiple concurrent failures
Valkey 8.1 introduced a lexicographic election ordering that dramatically reduces failover times, from minutes down to roughly 10-15 seconds
Memory Efficiency Improvements
Memory Overhead in Valkey 7.2
A simple key-value pair in Valkey 7.2 carried 52 bytes of overhead, which could be a significant portion of total memory usage for small keys and values
Valkey 8.0 and 8.1 introduced optimizations to the main hash table data structure to reduce this overhead
Hash Table Optimizations
Moved from a static structure to a variable-sized allocation, removing the need for extra pointers
Embedded the key and value directly into the hash table entries, eliminating the need for separate objects
Utilized SIMD instructions to efficiently check for key matches within hash table buckets
Replaced linked list collision resolution with linear probing and "Swiss table" techniques
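The last point, linear probing in place of chained linked lists, can be sketched briefly. This is a minimal Python model of open addressing, not Valkey's C implementation; the SIMD bucket scan and Swiss-table metadata are omitted:

```python
# Open addressing: entries live directly in the table array, so a
# collision means probing the next slot rather than chasing a pointer
# into a separately allocated linked-list node.

class ProbingTable:
    def __init__(self, capacity=8):
        self.slots = [None] * capacity  # each slot holds (key, value) inline

    def _index(self, key):
        return hash(key) % len(self.slots)

    def put(self, key, value):
        i = self._index(key)
        for _ in range(len(self.slots)):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return
            i = (i + 1) % len(self.slots)  # linear probe to the next slot
        raise RuntimeError("table full; a real table would resize here")

    def get(self, key):
        i = self._index(key)
        for _ in range(len(self.slots)):
            if self.slots[i] is None:
                return None  # an empty slot ends the probe sequence
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % len(self.slots)
        return None
```

Besides removing per-entry pointer overhead, keeping entries contiguous makes lookups cache-friendly, which is what enables the SIMD bucket scans mentioned above.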
Memory Savings
Valkey 8.0 reduced memory overhead by 20%, and Valkey 8.1 provided an additional 27% reduction
One customer saw a 41% total memory reduction when migrating from Valkey 7.2 to 8.1, with no performance regression
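Note that the two reductions compound rather than add. Multiplying them out gives roughly 42%, which is consistent with the 41% customer figure (total savings also depend on how large the payloads are relative to the overhead):

```python
# 8.0 cut overhead ~20%; 8.1 then cut ~27% of what remained.
remaining = (1 - 0.20) * (1 - 0.27)
print(f"combined reduction: {1 - remaining:.0%}")  # combined reduction: 42%
```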
Bloom Filters
Valkey 8.1 introduced Bloom filters, a probabilistic data structure that can efficiently represent sets
Bloom filters allow significant memory savings in use cases like IP blocklisting; one customer saw a 98% memory reduction
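To see why the savings are so large: a Bloom filter stores only bits, never the items themselves. Below is a toy Python sketch of the data structure, not Valkey's actual Bloom filter commands; the bit count and hash scheme are illustrative choices:

```python
import hashlib

# A Bloom filter is a bit array plus k hash functions. Membership tests
# can return false positives but never false negatives, which is
# acceptable for a blocklist: a rare extra block beats a missed one.

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive k bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

blocklist = BloomFilter()
blocklist.add("203.0.113.7")
print(blocklist.might_contain("203.0.113.7"))   # True (never a false negative)
print(blocklist.might_contain("198.51.100.1"))  # likely False; false positives are possible
```

Sizing the bit array against the expected item count controls the false-positive rate, which is the knob that trades memory for accuracy.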
Future Roadmap and Opportunities
Upcoming Valkey Improvements
Full-text search capabilities
Improved durability through integration with the durable MemoryDB service
Multi-threaded snapshot processing for faster managed upgrades
New data types, such as time series, to better fit customer workloads
Open Source Collaboration
Valkey's roadmap and development are fully open, encouraging community contributions and feedback
The team welcomes participation from developers, even those not familiar with the underlying C codebase
Customers are encouraged to share their use cases and requirements to help shape the future of Valkey