Overview
The presenters, Paul Megan and Mallerie Gershenfeld from the Amazon S3 team, walk through 30 launches delivered in the 12 months since re:Invent 2023.
Structured Data
S3 Tables: Fully Managed Apache Iceberg Tables in S3
- S3 has become a tabular data store: it holds exabytes of Parquet data and serves over 15 million requests per second against it.
- Customers have innovated on top of this Parquet data using open table formats like Apache Iceberg.
- S3 Tables are fully managed Apache Iceberg tables within S3, providing:
  - Table-level APIs for CRUD operations
  - Automatic performance optimizations (up to 10x higher transactions per second and up to 3x faster query performance than self-managed tables)
  - Simplified security with table-level policies
  - Automatic storage cost optimization
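The table-level APIs mentioned above might be used roughly as follows. This is a minimal sketch, not the presenters' code: it assumes a client shaped like the boto3 `s3tables` client, and the method and parameter names (`create_table_bucket`, `create_namespace`, `create_table`, `tableBucketARN`, `format`) reflect my reading of that API and should be checked against the current SDK documentation. The helper name `create_iceberg_table` is hypothetical.

```python
def create_iceberg_table(client, bucket_name, namespace, table_name):
    """Create a table bucket, a namespace inside it, and one Iceberg table.

    `client` is assumed to behave like boto3.client("s3tables"); the
    parameter names below are assumptions based on that API.
    """
    # A table bucket is the container that holds namespaces and tables.
    bucket_arn = client.create_table_bucket(name=bucket_name)["arn"]

    # Namespaces group related tables, much like a database or schema.
    client.create_namespace(tableBucketARN=bucket_arn, namespace=[namespace])

    # The table itself is created as a managed Apache Iceberg table.
    table = client.create_table(
        tableBucketARN=bucket_arn,
        namespace=namespace,
        name=table_name,
        format="ICEBERG",
    )
    return table["tableARN"]
```

With real credentials, the call would look like `create_iceberg_table(boto3.client("s3tables"), "demo-bucket", "analytics", "events")`; the bucket, namespace, and table names here are placeholders.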
Unstructured Data
S3 Metadata: Automatic Metadata Generation
- S3 now has over 400 trillion objects, many of which are unstructured data.
- Customers are struggling to manage and find objects in their large unstructured data stores.
- S3 Metadata provides:
  - Automatic metadata generation for every object, accessible via SQL
- Use cases:
  - Finding objects to process
  - Understanding data lineage
  - Analyzing storage usage
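To make "accessible via SQL" concrete, the sketch below runs two of the listed use cases (finding objects to process, analyzing storage usage) as SQL queries. It is an illustration only: it uses an in-memory SQLite table named `object_metadata` with three columns (`key`, `size`, `storage_class`) as a simplified, hypothetical stand-in for a real metadata table, which has more columns and would be queried through an analytics engine rather than SQLite.

```python
import sqlite3


def build_demo_table():
    """Create a tiny stand-in metadata table (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE object_metadata ("key" TEXT, size INTEGER, storage_class TEXT)'
    )
    conn.executemany(
        "INSERT INTO object_metadata VALUES (?, ?, ?)",
        [
            ("raw/a.parquet", 100, "STANDARD"),
            ("raw/b.parquet", 300, "STANDARD"),
            ("archive/c.parquet", 250, "GLACIER"),
        ],
    )
    return conn


def storage_by_class(conn):
    """Analyze storage usage: object count and total bytes per storage class."""
    return conn.execute(
        """
        SELECT storage_class, COUNT(*) AS objects, SUM(size) AS bytes
        FROM object_metadata
        GROUP BY storage_class
        ORDER BY bytes DESC
        """
    ).fetchall()


def find_objects(conn, prefix, min_size):
    """Find objects to process: keys under a prefix at or above a size threshold."""
    rows = conn.execute(
        'SELECT "key" FROM object_metadata '
        'WHERE "key" LIKE ? AND size >= ? ORDER BY "key"',
        (prefix + "%", min_size),
    ).fetchall()
    return [row[0] for row in rows]
```

The same `GROUP BY` and `WHERE` shapes carry over to a real metadata table once the column names are adjusted to its actual schema.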
Fundamentals
Scale
- S3 bucket limit increased from 1,000 to 1 million buckets per account.
- Supporting changes for managing large numbers of buckets, such as Service Quotas integration and an improved ListBuckets API.
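With far more buckets per account, listing them in one response stops being practical. The sketch below assumes the ListBuckets improvement is the paginated form exposed in boto3 as `list_buckets(MaxBuckets=..., ContinuationToken=..., Prefix=...)`; the summary itself only says "improved", so treat that mapping as an assumption. The helper name `iter_buckets` is hypothetical.

```python
def iter_buckets(client, prefix="", page_size=1000):
    """Yield bucket names page by page instead of in one large response.

    `client` is assumed to behave like boto3.client("s3").
    """
    kwargs = {"MaxBuckets": page_size}
    if prefix:
        # Optional server-side filter on the bucket name prefix.
        kwargs["Prefix"] = prefix
    while True:
        page = client.list_buckets(**kwargs)
        for bucket in page.get("Buckets", []):
            yield bucket["Name"]
        token = page.get("ContinuationToken")
        if not token:
            # No token in the response means this was the last page.
            return
        kwargs["ContinuationToken"] = token
```

A caller could then write `for name in iter_buckets(boto3.client("s3"), prefix="logs-"):` and process each bucket without holding the full list in memory.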
Durability
- Improved end-to-end data integrity checking, including client-side checksums.
- Introduced conditional requests (e.g., put-if-absent) for distributed applications.
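Both durability items can be combined in a single upload. The following is a minimal sketch, assuming a client shaped like boto3's S3 client: `IfNoneMatch="*"` asks S3 to reject the write if the key already exists (put-if-absent), and `ChecksumCRC32` carries a client-computed CRC32 so S3 can verify the bytes it received. The helper names and the duck-typed error handling are illustrative assumptions; production code would normally catch `botocore.exceptions.ClientError`.

```python
import base64
import zlib


def crc32_b64(data: bytes) -> str:
    """Base64 of the big-endian 32-bit CRC32, the form S3 expects for CRC32 checksums."""
    return base64.b64encode(zlib.crc32(data).to_bytes(4, "big")).decode("ascii")


def put_if_absent(client, bucket: str, key: str, body: bytes) -> bool:
    """Write `body` only if `key` does not exist yet. Returns True if written."""
    try:
        client.put_object(
            Bucket=bucket,
            Key=key,
            Body=body,
            IfNoneMatch="*",                # fail if any object already has this key
            ChecksumCRC32=crc32_b64(body),  # lets S3 validate the payload end to end
        )
    except Exception as exc:
        # Assumption: the failure looks like a botocore ClientError, whose
        # `response` dict carries the S3 error code.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "PreconditionFailed":
            return False  # another writer created the object first
        raise
    return True
```

In a distributed application, this lets several workers race to create the same key (for example a lock or a result marker) with exactly one of them succeeding, without a separate coordination service.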
Security
- Added more context to 403 access denied errors to aid troubleshooting.
- Expanded support for S3 Access Grants.
Client Portfolio
Performance Optimizations
- S3 Express One Zone for low-latency access.
- Integrations with services like SageMaker, EMR, and AWS KMS.
- Support for appending data to existing objects.
- Expansion to new AWS Regions, including AWS Dedicated Local Zones.
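Appending to an existing object can be sketched as a write at the object's current end. This assumes a client shaped like boto3's S3 client, the `WriteOffsetBytes` parameter of `put_object`, and a bucket type that supports appends (directory buckets in S3 Express One Zone, as I understand it). The helper name `append_bytes` is hypothetical.

```python
def append_bytes(client, bucket: str, key: str, data: bytes) -> int:
    """Append `data` to an existing object and return the expected new size.

    `client` is assumed to behave like boto3.client("s3").
    """
    # The append offset must equal the object's current size.
    size = client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    client.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        WriteOffsetBytes=size,  # write starting at the current end of the object
    )
    return size + len(data)
```

This suits workloads such as log shipping, where new records arrive continuously and rewriting the whole object on every update would be wasteful. If another writer appends between the `head_object` and `put_object` calls, the offset no longer matches and the caller needs to re-read the size and retry.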
Client-side Innovations
- Mountpoint for Amazon S3: Mounts S3 buckets as local file systems.
- S3 Connector for PyTorch: High-throughput data access for ML training.
- Storage Browser for Amazon S3 and AWS Transfer Family web apps: Managed web-based S3 access.
- Amplify Hosting integration: One-click static website hosting on S3.