AWS re:Invent 2025 - Data Protection Strategies for AI Data Foundation (AIM339)
Overview
This presentation from AWS re:Invent 2025 focuses on strategies for securing and protecting sensitive data used in AI and machine learning applications, particularly in the context of a nonprofit healthcare chatbot. The speakers, Derek Martinez and Sabrina Petruso, outline a comprehensive data protection framework and demonstrate the implementation of key security and privacy controls through a live coding example.
Key Challenges Addressed
Handling sensitive patient data in AI/ML applications
Mitigating risks of prompt injection attacks on language models
Ensuring data privacy and compliance (e.g. HIPAA) in AI data pipelines
Data Protection Framework
The presenters outline a 6-layer data protection framework:
1. Encryption
Encrypt data at rest and in transit using AWS KMS
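As a minimal sketch of requesting SSE-KMS on an S3 upload (the bucket, object key, and KMS alias below are hypothetical, not from the talk), the encryption parameters can be built separately from the API call:

```python
def encrypted_put_params(bucket, key, body, kms_key_id):
    """Build s3.put_object arguments that request server-side
    encryption with a customer-managed KMS key (SSE-KMS)."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",  # SSE-KMS rather than default SSE-S3
        "SSEKMSKeyId": kms_key_id,
    }

# Usage (requires AWS credentials; names are illustrative):
# import boto3
# s3 = boto3.client("s3")
# s3.put_object(**encrypted_put_params(
#     "patient-raw-data", "intake/record-001.json", b"{}", "alias/patient-data-key"))
```

Data in transit is covered separately: all AWS API endpoints enforce TLS.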
2. Fine-Grained Access Control
Leverage IAM to implement least-privilege access controls
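A least-privilege policy for this architecture might grant the chatbot's execution role read access only to the processed (de-identified) bucket, never the raw one. This is a hypothetical policy document; the bucket name is illustrative:

```python
# Hypothetical IAM policy: the chatbot role can read de-identified
# objects only; no access to the raw patient-data bucket is granted.
LEAST_PRIVILEGE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::processed-patient-data/*",
        }
    ],
}
```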
3. Auditing and Monitoring
Use AWS CloudTrail to log and audit all actions
4. Automated Compliance
Leverage AWS Config to define and monitor compliance rules (e.g. HIPAA)
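One way to sketch such a rule is with an AWS Config managed rule that flags S3 buckets without server-side encryption; the rule name below is illustrative, and the actual rules used in the demo were not specified:

```python
# Illustrative AWS Config rule definition using the managed rule that
# checks S3 buckets for server-side encryption.
ENCRYPTION_RULE = {
    "ConfigRuleName": "patient-bucket-encryption-check",
    "Source": {
        "Owner": "AWS",
        "SourceIdentifier": "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED",
    },
}

# Usage (requires AWS credentials):
# import boto3
# boto3.client("config").put_config_rule(ConfigRule=ENCRYPTION_RULE)
```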
5. PII Detection and Sanitization
Utilize Amazon Textract and Amazon Comprehend to detect and mask PII
Apply privacy-preserving anonymization techniques such as k-anonymity and randomization
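A k-anonymity check can be sketched in a few lines: a table is k-anonymous when every combination of quasi-identifier values (e.g. age band, partial ZIP code) appears in at least k records. The field names below are hypothetical:

```python
from collections import Counter

def satisfies_k_anonymity(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

rows = [
    {"age_band": "30-39", "zip3": "941"},
    {"age_band": "30-39", "zip3": "941"},
    {"age_band": "40-49", "zip3": "945"},
]
# The ("40-49", "945") group appears only once, so this table
# is not 2-anonymous; the first two rows alone are.
```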
6. Prompt Injection Defense
Detect and mitigate potential prompt injection attacks on the chatbot
Live Coding Example
The presenters walk through a live coding example of the data protection pipeline built on Amazon SageMaker:
1. Data Ingestion
Internal data owner uploads raw patient data to an S3 bucket
2. Data Processing
Amazon Textract extracts text from documents
Amazon Comprehend detects PII, which is then masked or anonymized before downstream use
Processed data is stored in a separate S3 bucket
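The masking step can be sketched as a pure function over detection results shaped like Comprehend's PII entity output, where each entity carries a type and character offsets (the sample text and offsets are illustrative):

```python
def mask_pii(text, entities):
    """Replace detected PII spans with their entity type.
    `entities` items follow the shape of Comprehend PII detection
    results: {"Type": ..., "BeginOffset": ..., "EndOffset": ...}."""
    # Apply replacements right-to-left so earlier offsets stay valid.
    for e in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[:e["BeginOffset"]] + f'[{e["Type"]}]' + text[e["EndOffset"]:]
    return text

entities = [
    {"Type": "NAME", "BeginOffset": 5, "EndOffset": 9},
    {"Type": "PHONE", "BeginOffset": 13, "EndOffset": 21},
]
# mask_pii("Call Jane at 555-0100", entities) → "Call [NAME] at [PHONE]"
```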
3. Prompt Injection Defense
API Gateway exposes a backend Lambda function to process user prompts
Lambda function checks for and mitigates potential prompt injection attacks
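The steps above can be sketched as a Lambda handler behind API Gateway; the blocklist check stands in for whatever detection logic the demo used, and all names here are hypothetical:

```python
import json

# Stand-in for the talk's (unpublished) detection logic.
BLOCKLIST = ("ignore previous instructions", "disregard the system prompt")

def lambda_handler(event, context):
    """Reject prompts that trip the injection heuristic before
    they ever reach the model."""
    prompt = json.loads(event.get("body") or "{}").get("prompt", "")
    if any(phrase in prompt.lower() for phrase in BLOCKLIST):
        return {"statusCode": 400,
                "body": json.dumps({"error": "prompt rejected"})}
    # ...forward the sanitized prompt to the model here...
    return {"statusCode": 200, "body": json.dumps({"prompt": prompt})}
```

Rejecting early in the Lambda keeps malicious input out of the model's context window entirely, rather than relying on the model to resist it.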