AWS re:Invent 2025 - AWS storage beyond data boundaries: Building the data foundation (INV215)
Building Blocks and Fundamentals
AWS engineering teams focus on "building blocks": polished, low-friction services and primitives that let builders create higher-level products.
The measure of success for these building blocks is a lack of friction - the teams aim to be "invisible" by removing thankless, effortful work.
Key fundamentals that the teams focus on are security, durability, availability, performance, and elasticity.
Over 80% of the work is "quiet innovation" - continuous refinement, rewriting, and improvement of existing services like Amazon S3.
Since 2020, there have been over 1,000 launches and improvements to AWS storage services.
Scaling Amazon S3
Amazon S3 is nearly 20 years old and stores over 500 trillion objects, serving over 200 million requests per second globally.
To handle this scale, the teams have had to continuously rewrite and reinvent the underlying data path, often rewriting components in Rust.
One example is the launch of conditional PUT operations in S3, which let distributed applications manage racing updates more easily.
Other recent improvements include raising the maximum object size to 50 TB, adding conditional copy and delete, atomic rename in S3 Express, and batch operations with prefix support.
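The conditional PUT behavior mentioned above can be illustrated with a small in-memory model. This is a minimal sketch, not the S3 API: `ToyObjectStore`, `PreconditionFailed`, and the key name are hypothetical, and the toy store is single-threaded, whereas a real store has to evaluate the precondition atomically. It mirrors the two preconditions S3 exposes on writes, the If-None-Match and If-Match headers.

```python
import hashlib


class PreconditionFailed(Exception):
    """Raised when a conditional write's precondition does not hold."""


class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object store; not the S3 API."""

    def __init__(self):
        self._objects = {}  # key -> (etag, body)

    def get(self, key):
        return self._objects[key]  # returns (etag, body)

    def put(self, key, body, if_none_match=None, if_match=None):
        current = self._objects.get(key)
        # If-None-Match: "*" means "create only if the key does not exist yet".
        if if_none_match == "*" and current is not None:
            raise PreconditionFailed(f"{key} already exists")
        # If-Match means "overwrite only if the stored ETag is the one I read".
        if if_match is not None and (current is None or current[0] != if_match):
            raise PreconditionFailed(f"{key} changed since it was read")
        etag = hashlib.md5(body).hexdigest()
        self._objects[key] = (etag, body)
        return etag


store = ToyObjectStore()
first_etag = store.put("leader.lock", b"writer-a", if_none_match="*")

# A second writer racing to create the same key loses cleanly.
try:
    store.put("leader.lock", b"writer-b", if_none_match="*")
    race_lost = False
except PreconditionFailed:
    race_lost = True

# Writer A updates using the ETag it read; writer B's stale ETag is rejected.
second_etag = store.put("leader.lock", b"writer-a-v2", if_match=first_etag)
try:
    store.put("leader.lock", b"writer-b-v2", if_match=first_etag)
    stale_rejected = False
except PreconditionFailed:
    stale_rejected = True
```

The point of the pattern is that the losing writer gets an explicit failure it can retry on, rather than silently overwriting the winner's data.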
Hardware and Software Innovation
S3 was originally built on hard drive-based "JBOD" (just a bunch of disks) servers, which the teams continuously optimized for density and efficiency.
The teams eventually moved to a more flexible "Metal Volumes" architecture, which decouples the physical storage from the compute hosts using Nitro-based virtualization.
Compute and storage can then be scaled independently, and power consumption drops by over 10%.
The Metal Volumes architecture also introduced "Nitro Offloads" to perform low-level storage maintenance operations directly on the drives, reducing network traffic.
S3 Express and Vectors
S3 Express is a high-performance, zonal version of S3 optimized for file-style workloads, with features like object rename.
One S3 Express customer, Meta, is using it as a backend for large-scale ML training, with over 60PB of storage and 1M transactions per second.
S3 Vectors is a new vector database API built on top of S3, allowing for similarity search and vector indexing of data.
Vector databases are typically in-memory, but the S3 Vectors team designed a solution to leverage S3's throughput-optimized architecture, using "neighborhoods" of vectors stored as objects.
In the first 5 months of the Vectors preview, over 250,000 vector indices were created, 40 billion vectors ingested, and 1 billion queries served.
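The "neighborhoods" idea can be sketched as a generic cluster-then-scan (inverted-file style) search. The talk summary does not describe the actual S3 Vectors design beyond neighborhoods stored as objects, so everything below is an assumption for illustration: the function names, the fixed centroids, and the `probe` parameter are all hypothetical. Each neighborhood stands in for one stored object that is fetched as a unit.

```python
import math


def build_neighborhoods(vectors, centroids):
    """Assign each vector to its nearest centroid; each group models one stored object."""
    neighborhoods = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))
        neighborhoods[nearest].append((vec_id, vec))
    return neighborhoods


def query(q, centroids, neighborhoods, probe=1, top_k=2):
    """Return the ids of the top_k closest vectors found in the probed neighborhoods."""
    # Step 1: rank the (small) set of centroids by distance to the query.
    ranked = sorted(range(len(centroids)), key=lambda i: math.dist(q, centroids[i]))
    # Step 2: fetch only the closest `probe` neighborhoods, as a few large reads.
    candidates = []
    for i in ranked[:probe]:
        candidates.extend(neighborhoods[i])
    # Step 3: exact distance scan over the fetched candidates only.
    candidates.sort(key=lambda item: math.dist(q, item[1]))
    return [vec_id for vec_id, _ in candidates[:top_k]]


vectors = {
    "a": (0.0, 0.1),
    "b": (0.2, 0.0),
    "c": (5.0, 5.1),
    "d": (5.2, 4.9),
}
centroids = [(0.0, 0.0), (5.0, 5.0)]  # assumed given; a real index would learn these

neighborhoods = build_neighborhoods(vectors, centroids)
result = query((0.1, 0.1), centroids, neighborhoods, probe=1, top_k=2)
print(result)
```

This shape trades a few large sequential reads for many small random ones, which is why it suits a throughput-optimized store better than an in-memory graph index would.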
S3 Tables and Metadata
S3 Tables offer managed Apache Iceberg tables on top of S3, providing a structured, mutable data store.
Over 400,000 S3 Tables have been created, with customers like Indeed using them to power their 85PB data lake.
S3 Metadata provides a managed, SQL-queryable view of the contents and mutations within an S3 bucket, enabling better visibility and analysis.
The S3 Metadata pattern has been extended to other AWS services, allowing joins across logs and object data for advanced analytics and troubleshooting.
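To make the "SQL-queryable view of contents and mutations" concrete, here is a minimal sketch that uses SQLite as a local stand-in for the query engine. The table name `object_events` and its columns are hypothetical and are not the actual S3 Metadata schema; the rows are made-up sample data. The query asks which keys are still live, meaning their most recent recorded mutation is not a delete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object_events ("
    "key TEXT, record_type TEXT, size INTEGER, record_timestamp TEXT)"
)
# Hypothetical mutation log: one row per create or delete event in a bucket.
conn.executemany(
    "INSERT INTO object_events VALUES (?, ?, ?, ?)",
    [
        ("logs/app-1.gz", "CREATE", 1200, "2025-01-01T00:00:00Z"),
        ("logs/app-2.gz", "CREATE", 800, "2025-01-02T00:00:00Z"),
        ("logs/app-1.gz", "DELETE", None, "2025-01-03T00:00:00Z"),
        ("models/v1.bin", "CREATE", 50000, "2025-01-04T00:00:00Z"),
    ],
)

# Keep each key's latest event, then drop keys whose latest event is a delete.
live_objects = conn.execute(
    """
    SELECT e.key, e.size
    FROM object_events AS e
    WHERE e.record_timestamp = (
        SELECT MAX(record_timestamp) FROM object_events WHERE key = e.key
    )
      AND e.record_type != 'DELETE'
    ORDER BY e.key
    """
).fetchall()
print(live_objects)
```

Answering the same question without such a view would typically mean listing the whole bucket, which is the kind of effortful work the managed metadata tables are meant to remove.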
Enterprise Migration and FSx Integration
Amazon FSx provides managed file storage services (e.g. FSx for NetApp ONTAP, FSx for Windows File Server) to ease enterprise migrations to the cloud.
S3 Access Points for FSx allow enterprises to access their existing file data through S3 APIs, enabling cloud-first development.
This integration between file storage and object storage primitives simplifies the process of migrating complex enterprise data to the cloud.
Key Takeaways
AWS storage teams focus relentlessly on polishing and improving core building blocks like Amazon S3, with over 1,000 launches and improvements since 2020.
Innovations span hardware, software, and new data primitives like S3 Vectors and S3 Tables to address evolving customer needs.
The teams optimize for scale, flexibility, and cost-efficiency, driving 10%+ power savings in some cases.
New services like S3 Vectors and S3 Metadata enable advanced analytics and AI/ML use cases on top of S3 data.
Tight integration between file storage (FSx) and object storage (S3) simplifies enterprise migrations to the cloud.