What’s new with Amazon S3 (STG212)

Overview

The presenters, Paul Megan and Mallerie Gershenfeld from the Amazon S3 team, walk through 30 launches delivered in the 12 months since re:Invent 2023.

Structured Data

S3 Tables: Fully Managed Apache Iceberg Tables in S3

  • S3 has become a tabular data store: it holds exabytes of Parquet data and serves over 15 million requests per second against it.
  • Customers have innovated on top of this Parquet data using open table formats like Apache Iceberg.
  • S3 Tables are fully managed Apache Iceberg tables within S3, providing:
    • Table-level APIs for CRUD operations
    • Automatic performance optimizations (up to 10x higher transactions per second and up to 3x faster query performance)
    • Simplified security with table-level policies
    • Automatic storage cost optimization

Unstructured Data

S3 Metadata: Automatic Metadata Generation

  • S3 now has over 400 trillion objects, many of which are unstructured data.
  • Customers are struggling to manage and find objects in their large unstructured data stores.
  • S3 Metadata provides:
    • Automatic metadata generation for every object, accessible via SQL
    • Use cases:
      • Finding objects to process
      • Understanding data lineage
      • Analyzing storage usage
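
To make the "accessible via SQL" idea concrete, here is a minimal sketch of the kind of queries the use cases above imply. It runs against a local in-memory SQLite table, not against S3 Metadata itself; the table name `object_metadata` and its columns (`object_key`, `size`, `storage_class`, `last_modified`) are illustrative assumptions, not the service's actual schema.

```python
import sqlite3

# Hypothetical, simplified stand-in for an object-metadata table.
# The table and column names are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object_metadata ("
    " object_key TEXT, size INTEGER, storage_class TEXT, last_modified TEXT)"
)
conn.executemany(
    "INSERT INTO object_metadata VALUES (?, ?, ?, ?)",
    [
        ("raw/a.jpg", 100, "STANDARD", "2024-11-01"),
        ("raw/b.jpg", 200, "STANDARD", "2024-11-15"),
        ("archive/c.mp4", 4000, "GLACIER", "2023-02-01"),
    ],
)


def storage_by_class(connection):
    """Analyze storage usage: object count and total bytes per storage class."""
    rows = connection.execute(
        "SELECT storage_class, COUNT(*), SUM(size) FROM object_metadata "
        "GROUP BY storage_class ORDER BY storage_class"
    )
    return rows.fetchall()


def objects_modified_since(connection, iso_date):
    """Find objects to process: keys modified on or after the given ISO date."""
    rows = connection.execute(
        "SELECT object_key FROM object_metadata "
        "WHERE last_modified >= ? ORDER BY object_key",
        (iso_date,),
    )
    return [row[0] for row in rows]
```

The point of the sketch is that questions which would otherwise require listing every object in a bucket become ordinary aggregate and filter queries once the metadata lives in a table.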

Fundamentals

Scale

  • S3 bucket limit increased from 1,000 to 1 million buckets per account.
  • Additional changes help manage large numbers of buckets, such as service quotas and an improved ListBuckets API.

Durability

  • Improved end-to-end data integrity checking, including client-side checksums.
  • Introduced conditional requests (e.g., put-if-absent) for distributed applications.
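
The two durability items above can be illustrated together with a small in-memory model. This is a sketch of the semantics only, under stated assumptions: `InMemoryBucket`, `PreconditionFailed`, `ChecksumMismatch`, and `try_acquire_lock` are hypothetical names invented here, not S3 or SDK APIs. Against S3 itself, put-if-absent corresponds to a PutObject request carrying the `If-None-Match: *` header, which fails with HTTP 412 Precondition Failed when the key already exists.

```python
import base64
import zlib


def crc32_b64(data: bytes) -> str:
    """Base64 encoding of the big-endian 4-byte CRC32 of data."""
    return base64.b64encode(zlib.crc32(data).to_bytes(4, "big")).decode("ascii")


class PreconditionFailed(Exception):
    """Raised when a conditional write finds the key already present."""


class ChecksumMismatch(Exception):
    """Raised when the supplied checksum does not match the received body."""


class InMemoryBucket:
    """Toy stand-in for a bucket; models only the two behaviors discussed."""

    def __init__(self):
        self._objects = {}

    def put(self, key, body, checksum_crc32=None, if_none_match=None):
        # Integrity check: recompute the checksum on the receiving side and
        # reject the write if it differs from what the client computed.
        if checksum_crc32 is not None and checksum_crc32 != crc32_b64(body):
            raise ChecksumMismatch(key)
        # Conditional write: "*" means "only if no object exists at this key".
        if if_none_match == "*" and key in self._objects:
            raise PreconditionFailed(key)
        self._objects[key] = body

    def get(self, key):
        return self._objects[key]


def try_acquire_lock(bucket, key, owner: bytes) -> bool:
    """Return True if this caller created the lock object, False otherwise."""
    try:
        bucket.put(key, owner, checksum_crc32=crc32_b64(owner), if_none_match="*")
        return True
    except PreconditionFailed:
        return False
```

In a distributed application, several workers can call something like `try_acquire_lock` with the same key; only the first write succeeds, so no separate coordination service is needed to decide a winner.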

Security

  • Added more context to 403 access denied errors to aid troubleshooting.
  • Expanded support for S3 Access Grants.

Client Portfolio

Performance Optimizations

  • S3 Express One Zone for low-latency access.
  • Integrations with services like SageMaker, EMR, and AWS KMS.
  • Support for appending data to existing objects.
  • Expansion to new AWS Regions and to AWS Dedicated Local Zones.

Client-side Innovations

  • Mountpoint for Amazon S3: Mounts S3 buckets as local file systems.
  • S3 Connector for PyTorch: High-throughput data access for ML training.
  • Storage Browser and Transfer Family Web Apps: Managed web-based S3 access.
  • Amplify Hosting integration: One-click static website hosting on S3.
