AWS re:Invent 2025 - How Netflix uses Amazon S3 Storage Lens to track exabytes of data (STG214)
Optimizing Exabyte-Scale Data Storage with Amazon S3 Storage Lens
Overview of Amazon S3 Insights Portfolio
Amazon S3 provides a suite of observability and analytics services to help customers gain visibility into their storage deployments:
S3 Storage Lens: Provides daily insights on storage and activity metrics at the organization, account, region, bucket, and prefix levels
S3 Metadata and S3 Inventory: Offer granular object-level reporting and metadata
S3 Server Access Logs: Capture detailed records of every request made to S3 buckets
Amazon CloudWatch: Integrates S3 metrics with other AWS service metrics for comprehensive monitoring
Netflix's Use of S3 Storage Lens
Netflix, a major S3 customer, has over 2 exabytes of data spread across big data, media, and machine learning use cases
Challenges include:
Diverse access patterns and abstractions built on S3
Multi-tenant buckets with different teams owning prefix paths
Lack of organization-wide visibility into usage and costs
Netflix's Approach:
Ingest S3 Storage Lens data into Apache Iceberg tables to aggregate usage trends
Create customized dashboards for different application owners to monitor their S3 usage
Implement automated alerts to notify owners when storage crosses certain thresholds
Leverage S3 access logs and inventory data to identify unused or underutilized data for optimization
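The ownership and threshold steps above can be sketched as follows. This is a minimal illustration, not Netflix's implementation: the prefix-to-owner mapping, the team names, and the `(prefix, storage_bytes)` row shape are all hypothetical stand-ins for whatever the ingested Storage Lens tables actually contain.

```python
from collections import defaultdict

# Hypothetical mapping from prefix paths to owning teams in a multi-tenant bucket.
PREFIX_OWNERS = {
    "warehouse/ads/": "ads-team",
    "warehouse/playback/": "playback-team",
}


def owner_for(prefix, owners=PREFIX_OWNERS):
    """Return the owner of the longest registered prefix that matches, else 'unowned'."""
    matches = [p for p in owners if prefix.startswith(p)]
    return owners[max(matches, key=len)] if matches else "unowned"


def storage_by_owner(rows, owners=PREFIX_OWNERS):
    """Sum storage bytes per owner from (prefix, storage_bytes) rows.

    Assumes the rows are non-overlapping prefixes, so no byte is counted twice.
    """
    totals = defaultdict(int)
    for prefix, storage_bytes in rows:
        totals[owner_for(prefix, owners)] += storage_bytes
    return dict(totals)


def owners_over_threshold(totals, threshold_bytes):
    """Return the owners whose total storage exceeds the threshold, sorted by name."""
    return sorted(o for o, b in totals.items() if b > threshold_bytes)
```

In practice the threshold would likely differ per team; a single value keeps the sketch short.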
Optimizing Data Placement with S3 Storage Lens Insights
Netflix used S3 Storage Lens metrics to identify performance issues with machine learning training jobs accessing data from S3
Identified throttling issues causing long cold starts and idle GPUs
Moved performance-sensitive data to Amazon EBS or Amazon FSx for Lustre to eliminate cold starts and improve GPU utilization
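One way to surface throttling of this kind is to compare throttled responses against total requests per bucket or prefix. The sketch below assumes each daily row carries a request count and a count of HTTP 503 responses; the field names `all_requests` and `errors_503` and the 1% cutoff are illustrative assumptions, not Storage Lens metric names or a value from the talk.

```python
def throttle_rate(all_requests, errors_503):
    """Fraction of requests that returned HTTP 503; 0.0 when there was no traffic."""
    if all_requests == 0:
        return 0.0
    return errors_503 / all_requests


def flag_throttled(daily_rows, max_rate=0.01):
    """Return the names of rows whose 503 rate exceeds max_rate.

    daily_rows: iterable of dicts with hypothetical keys
    'name', 'all_requests', and 'errors_503'.
    """
    return [
        row["name"]
        for row in daily_rows
        if throttle_rate(row["all_requests"], row["errors_503"]) > max_rate
    ]
```

Datasets flagged this way are candidates for a closer look before deciding whether a different storage tier is warranted.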
Integrating S3 Storage Lens into Platform Insights
Netflix ingests S3 Storage Lens data into internal platforms and dashboards
Allows application owners to easily access storage metrics and make informed decisions about their data usage
Empowers teams to self-manage their S3 usage without a central authority
Key Recommendations for Adopting S3 Storage Lens
Usage Trends: Ingest Storage Lens data into dashboards and review periodically
Automated Alerts: Set up automated growth alerts to notify application owners of changes
Platform Insights: Integrate Storage Lens data into internal platforms and dashboards for self-service access
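A growth alert of the kind recommended above could look like the following sketch. It assumes a per-owner series of daily storage totals, oldest first; the 7-day window and 20% limit are arbitrary example values rather than figures from the talk.

```python
def growth_alerts(history, window_days=7, max_growth=0.20):
    """Return {owner: growth} for owners whose storage grew faster than max_growth.

    history: dict mapping an owner to a list of daily storage bytes, oldest first.
    Growth is measured between the latest value and the value window_days earlier.
    Owners with too little history or a zero baseline are skipped.
    """
    alerts = {}
    for owner, series in history.items():
        if len(series) <= window_days:
            continue
        baseline = series[-1 - window_days]
        latest = series[-1]
        if baseline == 0:
            continue
        growth = (latest - baseline) / baseline
        if growth > max_growth:
            alerts[owner] = growth
    return alerts
```

Comparing against a value one window earlier, rather than the previous day, keeps short daily fluctuations from triggering notifications.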
New S3 Storage Lens Features
Expanded Prefix Analytics:
Provides metrics for all prefixes in a bucket, not only those that meet a minimum size threshold (previously at least 1% of the bucket's storage)
Supports up to 50 prefix levels and unlimited prefixes per bucket
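With metrics available for deep prefixes, a common follow-up is to roll them up to a shallower level for reporting. The sketch below assumes `(prefix, storage_bytes)` rows for non-overlapping leaf prefixes and a `/` delimiter; both the row shape and the helper names are hypothetical.

```python
def truncate_prefix(prefix, depth, delimiter="/"):
    """Keep only the first `depth` levels of a prefix, with a trailing delimiter."""
    parts = [p for p in prefix.split(delimiter) if p]
    return delimiter.join(parts[:depth]) + delimiter


def rollup(rows, depth, delimiter="/"):
    """Sum storage bytes by prefix truncated to `depth` levels.

    Assumes rows are non-overlapping leaf prefixes; mixing a parent prefix
    with its children in the input would double count.
    """
    totals = {}
    for prefix, storage_bytes in rows:
        key = truncate_prefix(prefix, depth, delimiter)
        totals[key] = totals.get(key, 0) + storage_bytes
    return totals
```

A prefix shallower than the requested depth is kept as is, so it still appears in the rolled-up totals.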
Performance Insights:
72 new metrics focused on application performance when accessing S3 data
Identify inefficient request patterns, cross-region access, and opportunities for caching
S3 Table Exports:
Export Storage Lens metrics directly to managed S3 tables for easy querying
Enables natural language-based analysis using AI assistants like Amazon Kendra
Key Takeaways
S3 Storage Lens provides comprehensive visibility into storage usage, access patterns, and performance at scale
Customers like Netflix leverage Storage Lens insights to optimize costs, improve application performance, and enable self-service access for teams
New features in Storage Lens expand prefix-level analytics, add performance-focused metrics, and simplify analysis through S3 table exports and natural language querying