Maximize the value of cold data with Amazon S3 Glacier storage classes (STG207)

Storing and Restoring Cold Data in Amazon S3 Glacier

Why Cold Data is Important

  • An estimated 70-80% of the world's data is "cold": accessed less than once per quarter
  • Customers store cold data for various reasons:
    • Preservation: Keeping media files and historical data (e.g., in a data lake) for future use
    • Backup: Ensuring data can be restored when needed, within recovery time objectives
    • Compliance: Retaining data for 5-10+ years to meet regulatory or self-imposed requirements

Storing Cold Data in Amazon S3

  • Amazon S3 offers different storage classes with varying access performance and costs:
    • S3 Standard: Frequently accessed data, millisecond access, higher storage cost
    • S3 Glacier Instant Retrieval: Millisecond access for rarely accessed data, lower storage cost
    • S3 Glacier Flexible Retrieval: Retrievals in minutes to hours, lower storage cost
    • S3 Glacier Deep Archive: Lowest storage cost; retrievals complete within 12 hours (Standard) or 48 hours (Bulk)
  • S3 Lifecycle policies can automatically transition data to colder storage classes as it ages (see the sketch after this list):
    • For example, move data to Glacier Instant Retrieval after 90 days, then to Glacier Deep Archive after 180 days
    • Rules can filter on object age, size, prefix, tags, and more
  • Amazon S3 Intelligent-Tiering can automatically move data between access tiers based on observed access patterns, without manual configuration
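
As a concrete illustration of the lifecycle rule described above, here is a minimal boto3 sketch. The bucket name, prefix, and size threshold are hypothetical; GLACIER_IR and DEEP_ARCHIVE are the S3 API identifiers for the two Glacier storage classes:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket/prefix: transition objects to Glacier Instant
# Retrieval after 90 days and to Glacier Deep Archive after 180 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-cold-data-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-cold-data",
                "Status": "Enabled",
                # Only match objects under this prefix and above 128 KB
                # (very small objects can cost more in archive tiers).
                "Filter": {
                    "And": {
                        "Prefix": "media/",
                        "ObjectSizeGreaterThan": 131072,
                    }
                },
                "Transitions": [
                    {"Days": 90, "StorageClass": "GLACIER_IR"},
                    {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```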

Restoring Cold Data from Amazon S3 Glacier

  • For Glacier Instant Retrieval, no restore step is needed: objects are read with the standard GET API, just like other S3 storage classes
  • For Glacier Flexible Retrieval and Glacier Deep Archive, a restore request must be initiated first (a single-object sketch follows this list):
    • Glacier Flexible Retrieval restore tiers: Bulk (free, 5-12 hours), Standard (3-5 hours), Expedited (1-5 minutes; provisioned capacity guarantees throughput)
    • Glacier Deep Archive restore tiers: Standard (within 12 hours) and Bulk (within 48 hours); Expedited is not available
    • Batch Operations can be used to efficiently initiate and track restore requests across millions of objects
    • A restore creates a temporary copy that S3 keeps for the number of days you specify; the archived object itself remains in its Glacier storage class
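
A minimal single-object restore with boto3 might look like the following (the bucket and key are hypothetical; the Restore field returned by HeadObject reports whether a restore is in progress or complete):

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket/key: request a free Bulk restore and keep the
# temporary copy available for 7 days.
s3.restore_object(
    Bucket="my-cold-data-bucket",
    Key="media/archive-2019.tar",
    RestoreRequest={
        "Days": 7,
        "GlacierJobParameters": {"Tier": "Bulk"},
    },
)

# Check restore status: the Restore field reads
# 'ongoing-request="true"' while in progress, and
# 'ongoing-request="false", expiry-date=...' once the copy is ready.
head = s3.head_object(Bucket="my-cold-data-bucket", Key="media/archive-2019.tar")
print(head.get("Restore", "no restore requested"))
```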

Customer Example: Deluxe Media

  • Deluxe Media originally restored objects from Glacier one at a time through its own microservices, which drove up costs and slowed restores
  • By transitioning to an S3 Batch Operations workflow (sketched below), Deluxe was able to:
    • Reduce restore costs by using the free Bulk restore tier
    • Decrease restore times from hours to minutes
    • Streamline its restoration process and simplify its microservices
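
A Batch Operations restore job like the one Deluxe adopted is created through the S3 Control API. This boto3 sketch assumes a hypothetical account ID, manifest object, report bucket, and IAM role; the manifest is a CSV of bucket,key pairs and the role must allow s3:RestoreObject:

```python
import boto3

s3control = boto3.client("s3control")

# All account IDs, ARNs, and the ETag below are hypothetical placeholders.
response = s3control.create_job(
    AccountId="111122223333",
    ConfirmationRequired=False,
    Operation={
        "S3InitiateRestoreObject": {
            "ExpirationInDays": 7,
            "GlacierJobTier": "BULK",  # free restore tier
        }
    },
    Manifest={
        "Spec": {
            "Format": "S3BatchOperations_CSV_20180820",
            "Fields": ["Bucket", "Key"],
        },
        "Location": {
            "ObjectArn": "arn:aws:s3:::my-manifests/restore-manifest.csv",
            "ETag": "manifest-object-etag",
        },
    },
    Report={
        "Bucket": "arn:aws:s3:::my-reports",
        "Format": "Report_CSV_20180820",
        "Enabled": True,
        "Prefix": "batch-restore-reports",
        "ReportScope": "AllTasks",
    },
    Priority=10,
    RoleArn="arn:aws:iam::111122223333:role/BatchOperationsRestoreRole",
    Description="Bulk restore of cold media assets",
)
print("Job ID:", response["JobId"])
```

One job initiates restores for every object in the manifest and produces a completion report, which replaces per-object orchestration logic in custom microservices.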

Key Takeaways

  1. Preserve your data assets cost-effectively using Amazon S3 Glacier
  2. Choose the right S3 storage class based on your access needs and cost requirements
  3. Leverage S3 Lifecycle policies and Intelligent Tiering to automate storage optimizations
  4. Use Batch Operations to efficiently restore large amounts of cold data from Glacier
