Maximize the value of cold data with Amazon S3 Glacier storage classes (STG207)

Storing and Restoring Cold Data in Amazon S3 Glacier

Why Cold Data is Important

  • An estimated 70-80% of the world's data is "cold": accessed less than once per quarter
  • Customers store cold data for various reasons:
    • Preservation: Keeping media files and historical data (e.g., in a data lake) for future use
    • Backup: Ensuring data can be restored when needed, within recovery time objectives
    • Compliance: Retaining data for 5-10+ years to meet regulatory or self-imposed requirements

Storing Cold Data in Amazon S3

  • Amazon S3 offers different storage classes with varying access performance and costs:
    • S3 Standard: Frequently accessed data, millisecond access, higher storage cost
    • S3 Glacier Instant Retrieval: Millisecond access for rarely accessed data, lower storage cost
    • S3 Glacier Flexible Retrieval: Retrievals in minutes to hours, lower storage cost
    • S3 Glacier Deep Archive: Lowest storage cost; retrievals complete within 12 hours (Standard) or 48 hours (Bulk)
  • S3 Lifecycle policies can automatically transition data to colder storage classes as it ages (see the sketch after this list):
    • For example, move data to Glacier Instant Retrieval after 90 days, then to Glacier Deep Archive after 180 days
    • Rules can filter on object age, size, prefix, tags, and more
  • Amazon S3 Intelligent-Tiering can automatically move data between access tiers based on observed access patterns, without manual configuration
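
As a concrete illustration of the lifecycle rule described above, here is a minimal boto3 sketch. The bucket name, prefix, and size threshold are hypothetical; GLACIER_IR and DEEP_ARCHIVE are the S3 API identifiers for the two Glacier storage classes:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket/prefix: transition objects to Glacier Instant
# Retrieval after 90 days and to Glacier Deep Archive after 180 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-cold-data-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-cold-data",
                "Status": "Enabled",
                # Only match objects under this prefix and above 128 KB
                # (very small objects can cost more in archive tiers).
                "Filter": {
                    "And": {
                        "Prefix": "media/",
                        "ObjectSizeGreaterThan": 131072,
                    }
                },
                "Transitions": [
                    {"Days": 90, "StorageClass": "GLACIER_IR"},
                    {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```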

Restoring Cold Data from Amazon S3 Glacier

  • For Glacier Instant Retrieval, no restore step is needed: objects are read with the standard GET API, just like other S3 storage classes
  • For Glacier Flexible Retrieval and Glacier Deep Archive, a restore request must be initiated first (a single-object sketch follows this list):
    • Glacier Flexible Retrieval restore tiers: Bulk (free, 5-12 hours), Standard (3-5 hours), Expedited (1-5 minutes; provisioned capacity guarantees throughput)
    • Glacier Deep Archive restore tiers: Standard (within 12 hours) and Bulk (within 48 hours); Expedited is not available
    • Batch Operations can be used to efficiently initiate and track restore requests across millions of objects
    • A restore creates a temporary copy that S3 keeps for the number of days you specify; the archived object itself remains in its Glacier storage class
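
A minimal single-object restore with boto3 might look like the following (the bucket and key are hypothetical; the Restore field returned by HeadObject reports whether a restore is in progress or complete):

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket/key: request a free Bulk restore and keep the
# temporary copy available for 7 days.
s3.restore_object(
    Bucket="my-cold-data-bucket",
    Key="media/archive-2019.tar",
    RestoreRequest={
        "Days": 7,
        "GlacierJobParameters": {"Tier": "Bulk"},
    },
)

# Check restore status: the Restore field reads
# 'ongoing-request="true"' while in progress, and
# 'ongoing-request="false", expiry-date=...' once the copy is ready.
head = s3.head_object(Bucket="my-cold-data-bucket", Key="media/archive-2019.tar")
print(head.get("Restore", "no restore requested"))
```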

Customer Example: Deluxe Media

  • Deluxe Media originally restored objects from Glacier one at a time through its own microservices, which drove up costs and slowed restores
  • By transitioning to an S3 Batch Operations workflow (sketched below), Deluxe was able to:
    • Reduce restore costs by using the free Bulk restore tier
    • Decrease restore times from hours to minutes
    • Streamline its restoration process and simplify its microservices
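
A Batch Operations restore job like the one Deluxe adopted is created through the S3 Control API. This boto3 sketch assumes a hypothetical account ID, manifest object, report bucket, and IAM role; the manifest is a CSV of bucket,key pairs and the role must allow s3:RestoreObject:

```python
import boto3

s3control = boto3.client("s3control")

# All account IDs, ARNs, and the ETag below are hypothetical placeholders.
response = s3control.create_job(
    AccountId="111122223333",
    ConfirmationRequired=False,
    Operation={
        "S3InitiateRestoreObject": {
            "ExpirationInDays": 7,
            "GlacierJobTier": "BULK",  # free restore tier
        }
    },
    Manifest={
        "Spec": {
            "Format": "S3BatchOperations_CSV_20180820",
            "Fields": ["Bucket", "Key"],
        },
        "Location": {
            "ObjectArn": "arn:aws:s3:::my-manifests/restore-manifest.csv",
            "ETag": "manifest-object-etag",
        },
    },
    Report={
        "Bucket": "arn:aws:s3:::my-reports",
        "Format": "Report_CSV_20180820",
        "Enabled": True,
        "Prefix": "batch-restore-reports",
        "ReportScope": "AllTasks",
    },
    Priority=10,
    RoleArn="arn:aws:iam::111122223333:role/BatchOperationsRestoreRole",
    Description="Bulk restore of cold media assets",
)
print("Job ID:", response["JobId"])
```

One job initiates restores for every object in the manifest and produces a completion report, which replaces per-object orchestration logic in custom microservices.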

Key Takeaways

  1. Preserve your data assets cost-effectively using Amazon S3 Glacier
  2. Choose the right S3 storage class based on your access needs and cost requirements
  3. Leverage S3 Lifecycle policies and Intelligent Tiering to automate storage optimizations
  4. Use Batch Operations to efficiently restore large amounts of cold data from Glacier
