AWS re:Invent 2025 - Best practices to simplify resilience at scale for Gen AI data & apps (STG317)
Simplifying Resilience at Scale for Gen AI Data & Apps
Importance of Resilience in the Cloud
52% of data leaders surveyed by AWS said their data foundations are not ready for AI implementation
Common resilience challenges in the cloud:
Accidental deletion of critical data
Software issues causing service disruptions
Malicious attacks like ransomware
Resilience for NoSQL Databases (DynamoDB)
DynamoDB is a popular choice for storing user profile and watch list data due to its scalability and partitioning
However, data stored in DynamoDB is still exposed to issues such as:
Canary deployments corrupting partitions
Lack of a single known good recovery point across partitions
Recovery process without Clumio is complex and time-consuming:
Requires restoring each impacted partition individually
Involves cherry-picking data and reconfiguring applications
Clumio Backtrack for DynamoDB simplifies recovery:
Allows recovery to any point in time, with data restored in place
No need to reconfigure applications or create temporary tables
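The "single known good recovery point" problem above can be illustrated with a minimal sketch. It assumes you have already worked out, for each impacted partition, when the first bad write landed. The function name, the partition labels, and the one-second safety margin are all hypothetical, and this is not a description of how Clumio or DynamoDB computes recovery points.

```python
from datetime import datetime, timedelta


def common_recovery_point(first_bad_write_by_partition,
                          safety_margin=timedelta(seconds=1)):
    """Return one timestamp that predates the first bad write in every partition.

    first_bad_write_by_partition: dict mapping a (hypothetical) partition label
    to the datetime of the first corrupting write seen in that partition.
    """
    if not first_bad_write_by_partition:
        raise ValueError("no impacted partitions given")
    # The earliest corruption across all partitions bounds the recovery point.
    earliest_bad = min(first_bad_write_by_partition.values())
    return earliest_bad - safety_margin


# Illustrative data only: three partitions hit at different times.
impacted = {
    "partition-a": datetime(2025, 12, 1, 10, 15, 0),
    "partition-b": datetime(2025, 12, 1, 10, 12, 30),
    "partition-c": datetime(2025, 12, 1, 10, 20, 45),
}

recovery_point = common_recovery_point(impacted)
print(recovery_point.isoformat())  # 2025-12-01T10:12:29
```

Choosing the earliest bad write keeps every partition consistent at one timestamp, at the cost of discarding good writes made to the other partitions after that moment.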
Resilience for LLM-Powered Chatbots (S3 & Vectors)
Chatbot functionality relies on movie data stored in S3 and on vector embeddings computed from that data
Loss of S3 data can render the vector store useless, causing chatbot failures
Recovery without Clumio is complex:
Requires full S3 bucket restore
Requires recomputing vectors and reconfiguring the LLM
Clumio Backtrack for S3 enables simple recovery:
Granular recovery of only impacted objects
No need to recompute vectors or reconfigure the LLM
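To make "granular recovery of only impacted objects" concrete, here is a minimal sketch of how a restore plan could be derived from an object version history. It assumes the bucket keeps prior versions (for example through S3 Versioning) and that the history has already been flattened into simple records. The record fields and the function name are hypothetical simplifications, not Clumio's implementation or the S3 API response shape.

```python
from datetime import datetime


def plan_granular_restore(versions, incident_time):
    """Map each impacted key to the version that was current before the incident.

    versions: list of dicts with the (hypothetical) fields
      key, version_id, last_modified (datetime), is_delete_marker (bool).
    Keys with no change at or after incident_time are left out of the plan.
    """
    by_key = {}
    for v in versions:
        by_key.setdefault(v["key"], []).append(v)

    plan = {}
    for key, history in by_key.items():
        if not any(v["last_modified"] >= incident_time for v in history):
            continue  # untouched by the incident, nothing to restore
        prior = [v for v in history if v["last_modified"] < incident_time]
        if not prior:
            continue  # object did not exist before the incident
        last_good = max(prior, key=lambda v: v["last_modified"])
        if last_good["is_delete_marker"]:
            continue  # object was already deleted before the incident
        plan[key] = last_good["version_id"]
    return plan


# Illustrative data only: one overwrite, one deletion, one untouched object.
incident = datetime(2025, 12, 1, 9, 0)
history = [
    {"key": "movies/1.json", "version_id": "1a",
     "last_modified": datetime(2025, 11, 20), "is_delete_marker": False},
    {"key": "movies/1.json", "version_id": "1b",
     "last_modified": datetime(2025, 12, 1, 9, 5), "is_delete_marker": False},
    {"key": "movies/2.json", "version_id": "2a",
     "last_modified": datetime(2025, 11, 21), "is_delete_marker": False},
    {"key": "movies/2.json", "version_id": "2del",
     "last_modified": datetime(2025, 12, 1, 9, 6), "is_delete_marker": True},
    {"key": "movies/3.json", "version_id": "3a",
     "last_modified": datetime(2025, 11, 22), "is_delete_marker": False},
]

restore_plan = plan_granular_restore(history, incident)
print(restore_plan)  # {'movies/1.json': '1a', 'movies/2.json': '2a'}
```

Because the untouched object never enters the plan, embeddings computed from it stay valid, which is the reason a granular restore can avoid recomputing the whole vector store.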
Resilience for Data Lakehouse (Apache Iceberg on S3)
Movie insights feature uses an Apache Iceberg data lakehouse on S3
Iceberg data is vulnerable to schema changes and data overwrites
Recovery without Clumio is challenging:
Requires restoring S3 data and rebuilding the Iceberg table structure
Requires reconfiguring applications and dashboards
Clumio Backtrack for Iceberg provides seamless recovery:
Preserves the Iceberg table structure during backup and restore
Supports converting between Glue and S3 Tables catalogs
Enables point-in-time recovery without application reconfiguration
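Point-in-time recovery for an Iceberg table comes down to choosing which snapshot the table should point at. The sketch below assumes a simplified list of snapshot records that reuse the `snapshot-id` and `timestamp-ms` field names from Iceberg table metadata. The function name and the sample values are hypothetical, the logic ignores branches and earlier rollbacks, and it is not a description of Clumio's backup format.

```python
def snapshot_as_of(snapshots, timestamp_ms):
    """Return the id of the latest snapshot committed at or before timestamp_ms.

    snapshots: list of dicts carrying "snapshot-id" and "timestamp-ms".
    Simplification: treats the list as one linear history.
    """
    eligible = [s for s in snapshots if s["timestamp-ms"] <= timestamp_ms]
    if not eligible:
        raise ValueError("no snapshot exists at or before the requested time")
    return max(eligible, key=lambda s: s["timestamp-ms"])["snapshot-id"]


# Illustrative data only: snapshot 103 is the unwanted overwrite.
table_snapshots = [
    {"snapshot-id": 101, "timestamp-ms": 1_764_576_000_000},
    {"snapshot-id": 102, "timestamp-ms": 1_764_579_600_000},
    {"snapshot-id": 103, "timestamp-ms": 1_764_583_200_000},
]

bad_overwrite_ms = 1_764_583_200_000
target_id = snapshot_as_of(table_snapshots, bad_overwrite_ms - 1)
print(target_id)  # 102
```

Selecting a snapshot only helps while the data and metadata files it references still exist, which is why the notes above stress preserving the table structure during backup and restore.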
Key Recommendations for Resilient Gen AI Apps
Protect the entire data pipeline, not just individual components
Ensure fast recovery to minimize user/business disruption
Architect for dynamic, scalable cloud environments
Clumio's Approach to Resilience
Recovery in place to avoid application reconfiguration
Automated discovery of existing and new cloud resources
Fully serverless and elastic architecture to scale with the cloud
Next Steps
Sign up for a free 14-day trial of Clumio on AWS Marketplace
Learn more about Clumio Backtrack for DynamoDB and Iceberg via the provided QR codes