AWS re:Invent 2025 - Maximize the value of cold data with Amazon S3 Glacier storage classes (STG208)
Maximizing the Value of Cold Data with Amazon S3 Glacier Storage Classes
The Importance of Cold Data
S3 currently stores hundreds of trillions of objects, and 70-80% of that data is "cold": rarely accessed and stored for months, years, or decades
Cold data is no longer just dormant storage, but a catalyst for innovation across industries
Customers are unlocking new insights and competitive advantages by leveraging their historical, archived data
S3 Storage Class Continuum
S3 offers a continuum of storage classes balancing access speed and cost efficiency:
S3 Standard: Millisecond access for active data
S3 Standard-Infrequent Access (S3 Standard-IA): Lower cost for less frequently accessed data
S3 Glacier Storage Classes:
Glacier Instant Retrieval: Fast access for rarely accessed critical data
Glacier Flexible Retrieval: Lower cost with flexible retrieval times
Glacier Deep Archive: Lowest cost storage for long-term retention
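As a rough illustration of how these classes show up in practice, the sketch below maps the classes listed above to the `StorageClass` values used by the S3 API and builds the keyword arguments for a boto3 `put_object` call. The helper `put_object_args`, the dictionary keys, and the bucket and key names are hypothetical. The upload itself is left commented out because it needs AWS credentials.

```python
# Storage classes from the list above, mapped to S3 API StorageClass values.
# The dictionary keys are illustrative labels, not AWS identifiers.
STORAGE_CLASSES = {
    "standard": "STANDARD",
    "infrequent_access": "STANDARD_IA",
    "glacier_instant_retrieval": "GLACIER_IR",
    "glacier_flexible_retrieval": "GLACIER",
    "glacier_deep_archive": "DEEP_ARCHIVE",
}


def put_object_args(bucket: str, key: str, body: bytes, storage_class: str) -> dict:
    """Build keyword arguments for an S3 put_object call (hypothetical helper)."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": STORAGE_CLASSES[storage_class],
    }


# Bucket and key are placeholders for illustration only.
args = put_object_args(
    "example-archive-bucket", "logs/2019/app.log.gz", b"...", "glacier_deep_archive"
)

# With credentials configured, the upload would look like:
# import boto3
# boto3.client("s3").put_object(**args)
```

Writing directly into a Glacier class suits data that is known to be cold at creation time; data whose access pattern changes over time is usually better handled by the lifecycle automation described next.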
Automating Data Lifecycle Management
S3 Lifecycle Policies allow automatically transitioning data between storage classes as access patterns change
Policies can be applied to entire buckets or filtered by prefix, tags, size, and versioning
S3 Intelligent-Tiering automatically moves data between access tiers based on usage patterns
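A lifecycle policy of the kind described above can be sketched as a configuration document passed to boto3's `put_bucket_lifecycle_configuration`. The rule ID, prefix, tag, size threshold, and day counts below are assumptions chosen for illustration, not values from the talk.

```python
def build_lifecycle_configuration() -> dict:
    """Return a lifecycle configuration in the shape S3 expects.

    All concrete values (rule ID, prefix, tag, thresholds) are illustrative.
    """
    return {
        "Rules": [
            {
                "ID": "archive-old-logs",  # hypothetical rule name
                "Status": "Enabled",
                # Filter by prefix, tag, and minimum object size together.
                "Filter": {
                    "And": {
                        "Prefix": "logs/",
                        "Tags": [{"Key": "retention", "Value": "long-term"}],
                        "ObjectSizeGreaterThan": 128 * 1024,  # bytes
                    }
                },
                # Current object versions: move to colder classes as they age.
                "Transitions": [
                    {"Days": 90, "StorageClass": "GLACIER_IR"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
                # Noncurrent versions (versioned buckets) are archived sooner.
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": 30, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


def apply_lifecycle(bucket: str) -> None:
    """Apply the configuration to a bucket; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the config can be inspected offline

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_lifecycle_configuration(),
    )


config = build_lifecycle_configuration()
```

Keeping the configuration in a plain function makes it easy to review or unit test before it is applied to a real bucket.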
Restoring Archived Data
Key use cases for restoring archived data:
Reviving historical content for new audiences
Leveraging archived data for strategic decision-making
Training machine learning models on vast historical datasets
Restoration options:
Glacier Instant Retrieval: Read with the same GET API as other S3 storage classes, with no restore step, but at higher retrieval costs
Glacier Flexible Retrieval and Glacier Deep Archive: 3-step process (initiate, monitor, access)
Batch Operations: Optimize restore performance by maximizing transactions per second
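The three-step restore flow (initiate, monitor, access) for a single object might look roughly like the sketch below, using boto3's `restore_object`, `head_object`, and `get_object`. The helper names, the 7-day restore window, the Bulk tier, and the polling interval are assumptions. For large numbers of objects, S3 Batch Operations would replace this per-object loop.

```python
import re
import time


def build_restore_request(days: int = 7, tier: str = "Bulk") -> dict:
    """Build the RestoreRequest argument for restore_object.

    Note: the Expedited tier is not available for Glacier Deep Archive.
    """
    if tier not in {"Expedited", "Standard", "Bulk"}:
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def restore_in_progress(restore_header):
    """Interpret the 'Restore' value returned by head_object.

    Returns None if no restore was requested, True while the restore is
    running, and False once the temporary copy is available.
    """
    if restore_header is None:
        return None
    match = re.search(r'ongoing-request="(true|false)"', restore_header)
    if match is None:
        raise ValueError(f"unexpected Restore header: {restore_header!r}")
    return match.group(1) == "true"


def restore_and_read(bucket: str, key: str, poll_seconds: int = 600) -> bytes:
    """Initiate a restore, poll until it completes, then read the object."""
    import boto3  # requires AWS credentials; imported lazily

    s3 = boto3.client("s3")
    # Step 1: initiate the restore.
    s3.restore_object(Bucket=bucket, Key=key, RestoreRequest=build_restore_request())
    # Step 2: monitor; restores can take hours, so poll infrequently.
    while restore_in_progress(s3.head_object(Bucket=bucket, Key=key).get("Restore")):
        time.sleep(poll_seconds)
    # Step 3: access the temporarily restored copy.
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
```

In production, S3 Event Notifications for restore completion are generally a better fit than polling; the loop here only keeps the sketch self-contained.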
New Archive-Focused Features
Compute Checksum Operation:
Allows verifying data integrity of objects stored in any S3 storage class, including Glacier
Eliminates the need to download objects to calculate checksums locally
Leverages S3 Batch Operations for efficient, scalable checksum verification
Provides detailed completion reports for auditing and compliance
S3 Metadata:
Automatically extracts and stores object metadata (tags, storage class, size, etc.) in queryable tables
Enables instant SQL queries and natural language searches on archived data
Provides a "system of record" for understanding the contents and state of S3 buckets
Democratizes access to insights from archived data for teams beyond just data engineers
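To give a feel for the kind of SQL question a metadata table can answer, the sketch below runs an aggregate query against a local in-memory SQLite table that stands in for an S3 Metadata table. The table name, column names, and sample rows are all made up for illustration; the real S3 Metadata schema and query engine differ, so treat this only as the shape of the query.

```python
import sqlite3

# Local stand-in for a metadata table; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object_metadata (object_key TEXT, size INTEGER, storage_class TEXT)"
)
conn.executemany(
    "INSERT INTO object_metadata VALUES (?, ?, ?)",
    [
        ("video/1998/final.mov", 5_000_000_000, "DEEP_ARCHIVE"),
        ("video/2004/promo.mov", 2_000_000_000, "DEEP_ARCHIVE"),
        ("logs/2023/app.log.gz", 40_000_000, "GLACIER_IR"),
    ],
)

# How many objects, and how many bytes, sit in each storage class?
query = """
SELECT storage_class, COUNT(*) AS objects, SUM(size) AS total_bytes
FROM object_metadata
GROUP BY storage_class
ORDER BY total_bytes DESC
"""
summary = conn.execute(query).fetchall()
for storage_class, objects, total_bytes in summary:
    print(storage_class, objects, total_bytes)
```

The same style of query, pointed at real metadata tables, is what lets a team inventory an archive without listing or restoring the objects themselves.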
Key Takeaways
Cold data is a valuable asset, not just dormant storage, with opportunities to drive innovation and competitive advantage
S3 Glacier storage classes offer a continuum of cost and access tradeoffs to optimize for different cold data use cases
Automated lifecycle management and restoration capabilities make it easy to manage and access archived data
New features like Compute Checksum and S3 Metadata simplify data integrity validation and enable rapid insights from archived data