AWS re:Invent 2025 - Accelerate & automate secure data transfers at scale with AWS DataSync (STG340)
Accelerating Secure Data Transfers at Scale with AWS DataSync
Overview
Enterprises are creating exabytes of data every day, distributed across on-premises, edge, and multi-cloud environments
This creates challenges around data migration, governance, security, and performance at scale
AWS DataSync is a fully managed data transfer service designed to address these challenges
Key Use Cases for DataSync
Migrations: Quickly and easily migrate file and object data from on-premises or other clouds to AWS
Replication: Create secondary copies of data for disaster recovery
Archive: Move cold, infrequently accessed data to cost-effective AWS storage like S3 Glacier
Accelerated Workflows: Enable high-speed data transfers to support business-critical workloads
DataSync Capabilities
Supports data movement from on-premises storage, other clouds, and between AWS services
Preserves file metadata like permissions, timestamps, and ACLs during transfers
Provides advanced features like flexible scheduling, bandwidth control, and detailed reporting
Fully managed service that handles the underlying infrastructure and network optimization
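As a minimal sketch of how the scheduling, bandwidth control, and metadata preservation features above map onto a task definition, the helper below assembles keyword arguments for the DataSync CreateTask API as exposed by boto3. The function name `build_task_request`, its parameters, and the chosen option values are assumptions made for illustration and are not taken from the talk.

```python
from typing import Any, Dict, Optional


def build_task_request(
    source_location_arn: str,
    destination_location_arn: str,
    name: str,
    bandwidth_limit_mib_per_s: Optional[int] = None,
    schedule_expression: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a DataSync create_task call (hypothetical helper)."""
    options: Dict[str, Any] = {
        # Ask DataSync to preserve POSIX metadata on the destination.
        "Mtime": "PRESERVE",
        "Uid": "INT_VALUE",
        "Gid": "INT_VALUE",
        "PosixPermissions": "PRESERVE",
        # Verify only the data transferred in this run.
        "VerifyMode": "ONLY_FILES_TRANSFERRED",
        # -1 means "no bandwidth limit" in the DataSync API.
        "BytesPerSecond": -1,
    }
    if bandwidth_limit_mib_per_s is not None:
        # The API expects bytes per second, so convert from MiB/s.
        options["BytesPerSecond"] = bandwidth_limit_mib_per_s * 1024 * 1024

    request: Dict[str, Any] = {
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Name": name,
        "Options": options,
    }
    if schedule_expression is not None:
        # Example value: "cron(0 2 * * ? *)" for a nightly run at 02:00 UTC.
        request["Schedule"] = {"ScheduleExpression": schedule_expression}
    return request


# Usage (requires boto3 and AWS credentials; the ARNs are placeholders):
#   import boto3
#   request = build_task_request(src_arn, dst_arn, "nightly-sync",
#                                bandwidth_limit_mib_per_s=100,
#                                schedule_expression="cron(0 2 * * ? *)")
#   boto3.client("datasync").create_task(**request)
```

Keeping the request construction separate from the API call makes the settings easy to review and unit test before any data moves.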
DataSync Enhanced Mode
Supports virtually unlimited numbers of files and objects per transfer for S3 and cross-cloud scenarios
Increases transfer speeds for large files by breaking them into parallel chunks
Simplifies cross-cloud transfers by eliminating the need for agents in the other cloud
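The idea of splitting a large file into chunks that are copied in parallel can be illustrated with a short sketch. This is a generic illustration of the technique, not DataSync's internal implementation; the names `plan_chunks`, `transfer_in_parallel`, and the `copy_chunk` callback are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def plan_chunks(file_size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that cover file_size bytes exactly once."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


def transfer_in_parallel(
    file_size: int,
    chunk_size: int,
    copy_chunk: Callable[[int, int], int],
    workers: int = 4,
) -> int:
    """Copy every chunk with a thread pool and return the total bytes copied.

    copy_chunk(offset, length) is supplied by the caller and must return the
    number of bytes it copied for that byte range.
    """
    chunks = plan_chunks(file_size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda chunk: copy_chunk(*chunk), chunks))
```

Because each chunk is an independent byte range, a single large file no longer limits throughput to one stream; several ranges can be in flight at once.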
PathAI's Use Case
PathAI digitizes pathology workflows by converting glass slides to digital images
This generates massive amounts of data (gigabytes per slide) that need to be securely transferred to the cloud
PathAI used DataSync to build a seamless data pipeline, allowing labs to push data to S3 without complex IT setups
This enabled PathAI to onboard labs in the US, Europe, and South America, moving petabytes of data to power their AI-based pathology platform
Optimizing DataSync Performance
DataSync Agent Deployment: Agents can run as EC2 instances or as on-premises VMs; for on-premises sources, deploying the agent close to the storage is recommended for low-latency access
Testing and Validation: Run small-scale tests to validate connectivity, performance, and error recovery before full migrations
Scaling with Multiple Tasks: Partition data sets and run multiple parallel DataSync tasks to maximize available network bandwidth
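One way to partition a data set across parallel tasks is to group top-level directories so that each task receives a similar amount of data. The sketch below uses a greedy largest-first assignment and turns each group into a DataSync include filter. The helper names, the example directory sizes, and the choice of a greedy heuristic are assumptions for illustration; the talk does not prescribe a specific partitioning method.

```python
import heapq
from typing import Dict, List


def partition_by_size(dir_sizes: Dict[str, int], task_count: int) -> List[List[str]]:
    """Assign directories to task_count groups, largest first, always to the lightest group."""
    if task_count <= 0:
        raise ValueError("task_count must be positive")
    # Heap of (total_bytes_assigned, group_index) so the lightest group pops first.
    heap = [(0, index) for index in range(task_count)]
    heapq.heapify(heap)
    groups: List[List[str]] = [[] for _ in range(task_count)]
    # Sort by descending size; break ties by path for deterministic output.
    for path, size in sorted(dir_sizes.items(), key=lambda item: (-item[1], item[0])):
        total, index = heapq.heappop(heap)
        groups[index].append(path)
        heapq.heappush(heap, (total + size, index))
    return groups


def include_filter(paths: List[str]) -> List[Dict[str, str]]:
    """Build a DataSync include filter; patterns are joined with '|'."""
    return [{"FilterType": "SIMPLE_PATTERN", "Value": "|".join(paths)}]


# Usage (requires boto3 and AWS credentials; task_arn is a placeholder):
#   import boto3
#   client = boto3.client("datasync")
#   for group in partition_by_size({"/a": 50, "/b": 30, "/c": 20, "/d": 10}, 2):
#       if group:  # skip empty groups when there are more tasks than directories
#           client.start_task_execution(TaskArn=task_arn, Includes=include_filter(group))
```

Balancing groups by size rather than by directory count helps the parallel executions finish at roughly the same time, which keeps the network link busy for the whole transfer window.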
Key Takeaways
DataSync provides a fully managed, secure, and scalable solution for large-scale data migrations and transfers
Enhanced mode enables increased performance and simplified cross-cloud workflows
PathAI used DataSync to build a robust data pipeline, enabling digital pathology workflows powered by cloud-based AI
DataSync can be optimized through agent placement, testing, and parallel task execution to achieve high-speed data transfers