AWS re:Invent 2025 - What's new with Amazon SageMaker in the era of unified data and AI (ANT216)

Summary of AWS re:Invent 2025 - What's new with Amazon SageMaker in the era of unified data and AI (ANT216)

Overview of Next Generation Amazon SageMaker

  • AWS introduced the next generation of Amazon SageMaker at re:Invent 2024 to address key challenges faced by customers:
    • Rapidly evolving data and AI landscape requiring manual stitching of complex tool sets
    • Data sprawl and fragmented governance leading to difficulty leveraging data for AI
    • Need to integrate data analytics and AI at scale across teams and workflows
  • The next generation of SageMaker aims to provide a unified experience by:
    1. Offering a SageMaker Unified Studio with integrated tools for SQL, Python, visual workflows, and natural language interaction
    2. Providing a SageMaker Catalog for centralized data and AI governance and access control
    3. Building on an open data lakehouse architecture using Apache Iceberg for consistent data access

SageMaker Unified Studio

  • Unified Studio replaces separate development environments with a single workspace that integrates:
    • Python code editors, SQL query editors, and visual drag-and-drop interfaces
    • Natural language interaction through a generative AI assistant
    • Seamless integration with underlying AWS data processing and compute services
  • Enables teams to collaborate without context switching, with all data accessible in one place
  • Supports a wide range of user preferences and skill levels

SageMaker Catalog for Governance

  • SageMaker Catalog provides a single place for data and AI governance with access controls
    • Allows secure discovery of and access to approved data, models, compute, and AI assets
    • Provides semantic search, data quality monitoring, sensitivity detection, and end-to-end lineage
    • Integrates Amazon Bedrock Guardrails for responsible AI, including blocking sensitive information and reducing hallucinations

Open Data Lakehouse Architecture

  • SageMaker is built on an open data lakehouse architecture using Apache Iceberg
    • Enables consistent data access across S3, Redshift, and other Iceberg-compatible engines
    • Provides 15+ zero-ETL integrations to bring in data from operational and enterprise applications
    • Allows federated queries to access on-premises and third-party data sources
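
To make the "consistent data access" point concrete, here is a toy sketch of the snapshot idea behind Apache Iceberg: each commit adds a new immutable snapshot listing the table's data files, so every engine that reads a given snapshot sees the same files. The names `Snapshot`, `ToyIcebergTable`, `commit_append`, and `scan`, as well as the `s3://example-bucket/...` paths, are hypothetical and are not the Iceberg or SageMaker API. Real Iceberg tracks snapshots through metadata and manifest files rather than an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """An immutable view of the table: the data files visible at one commit."""
    snapshot_id: int
    data_files: tuple[str, ...]


@dataclass
class ToyIcebergTable:
    """Hypothetical stand-in for an Iceberg table's snapshot log."""
    snapshots: list[Snapshot] = field(default_factory=list)

    def commit_append(self, new_files: list[str]) -> Snapshot:
        # A commit never mutates earlier snapshots; it appends a new one.
        previous = self.snapshots[-1].data_files if self.snapshots else ()
        snap = Snapshot(len(self.snapshots) + 1, previous + tuple(new_files))
        self.snapshots.append(snap)
        return snap

    def current(self) -> Snapshot:
        return self.snapshots[-1]

    def scan(self, snapshot_id: int) -> tuple[str, ...]:
        # Any engine pinned to the same snapshot id sees the same files.
        return self.snapshots[snapshot_id - 1].data_files


table = ToyIcebergTable()
table.commit_append(["s3://example-bucket/orders/part-0.parquet"])
pinned = table.current().snapshot_id  # a reader starts at this snapshot

# A writer commits more data while the reader is still running.
table.commit_append(["s3://example-bucket/orders/part-1.parquet"])

engine_a_view = table.scan(pinned)  # e.g. one query engine
engine_b_view = table.scan(pinned)  # e.g. a different query engine
latest_view = table.scan(table.current().snapshot_id)
```

Both pinned readers see the same single file, while a reader that starts after the second commit sees both files.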

Key Launches and Enhancements

  1. Visual Workflows and Job Monitoring for Data Engineers:

    • Expanded visual ETL interface to automate Glue jobs and other tasks
    • Allows data engineers to create and visualize complex pipelines without Python DAG code
  2. Unstructured Data Support in SageMaker Catalog:

    • Brought S3 buckets into the SageMaker Catalog to manage unstructured data assets
    • Enables data producers to enrich unstructured data with metadata for data consumers
  3. QuickSight Integration for Data Analysts:

    • Allows data analysts to quickly visualize data sets in SageMaker and publish dashboards
    • Integrates with the SageMaker Catalog for discovery and access control
  4. S3 Table Governance with Tag-based Access Control:

    • Expanded tag-based access control to Amazon S3 Tables, Redshift, and federated data sources
    • Enables data administrators to define permissions based on tags, scaling access control
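
The tag-based access control idea in item 4 can be illustrated with a small model: permissions are granted on tag combinations instead of on individual tables, so a newly tagged table is covered without a new grant. This is a sketch under stated assumptions, not an AWS API. `Resource`, `TagGrant`, `make_resource`, and `is_allowed`, along with the tag keys and principal names, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    tags: frozenset  # set of (key, value) pairs attached to the resource


@dataclass(frozen=True)
class TagGrant:
    principal: str
    required_tags: frozenset  # (key, value) pairs the resource must carry
    permission: str


def make_resource(name: str, **tags: str) -> Resource:
    """Build a resource from keyword tags, e.g. domain="sales"."""
    return Resource(name, frozenset(tags.items()))


def is_allowed(principal: str, permission: str,
               resource: Resource, grants: list[TagGrant]) -> bool:
    """Allow access if any grant for this principal and permission
    has all of its required tags present on the resource."""
    return any(
        g.principal == principal
        and g.permission == permission
        and g.required_tags <= resource.tags
        for g in grants
    )


# One grant, written once by the data administrator.
grants = [TagGrant("analyst", frozenset({("domain", "sales")}), "SELECT")]

orders = make_resource("orders", domain="sales", sensitivity="public")
payroll = make_resource("payroll", domain="hr")

# A table created later is covered as soon as it carries the tag.
returns = make_resource("returns", domain="sales")

can_read_orders = is_allowed("analyst", "SELECT", orders, grants)
can_read_payroll = is_allowed("analyst", "SELECT", payroll, grants)
can_read_returns = is_allowed("analyst", "SELECT", returns, grants)
```

The number of grants grows with the number of tag combinations rather than with the number of tables, which is the scaling benefit the summary describes.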

Recent Launches

  1. One-click Onboarding with IAM-based Domains:

    • Allows users to leverage existing IAM roles to set up SageMaker Unified Studio in minutes
    • Automatically configures access to S3, Glue Data Catalog, and other resources based on the IAM role
  2. Serverless Notebook Experience:

    • Introduced a new serverless, web-based notebook environment for interactive data exploration and model development
    • Supports both Python and SQL natively, with automatic data transfer between the two
  3. SageMaker Data Agent:

    • A purpose-built AI agent that leverages metadata to provide contextual guidance and code generation
    • Helps users explore data, visualize insights, and build end-to-end ML workflows without manual setup
  4. Amazon Athena for Apache Spark Engine:

    • A new high-performance query engine that powers the serverless notebook experience
    • Provides scalable processing of data in the SageMaker lakehouse architecture

Customer Feedback and Impact

  • Customers such as Deloitte have seen faster delivery and experimentation with the developer-focused, unified SageMaker experience
  • The new capabilities have enabled users across data engineering, data science, and analytics to collaborate more effectively on data and AI initiatives

Key Takeaways

  • AWS has introduced a next-generation Amazon SageMaker platform that unifies data analytics, machine learning, and AI capabilities
  • The SageMaker Unified Studio, Catalog, and open lakehouse architecture aim to address challenges around data sprawl, fragmented governance, and the need for integrated data and AI workflows
  • Recent launches have focused on improving onboarding, developer experience, and AI-assisted data exploration and model development
  • Customers have reported increased productivity and collaboration across data and AI teams leveraging the new SageMaker capabilities
