Modern data patterns for modern data strategies (INV201)

Embracing the Power of Data: A Journey through Three Data Patterns

Key Takeaways

  1. Aggregate Data Pattern:

    • Brings data from many sources together in Amazon S3, with federated ownership and direct access for application owners.
    • Standardization practices like using Parquet files and Apache Iceberg are crucial for success.
    • New capabilities like Amazon S3 Tables (managed Apache Iceberg tables) and Amazon S3 Metadata (queryable object metadata) strengthen the Aggregate Data Pattern.
  2. Curate Data Pattern:

    • Application developers access curated, high-quality datasets through internal or external data marketplaces.
    • Data and business catalogs like AWS Glue Data Catalog and Amazon DataZone help with standardization and governance.
    • Data products and internal data marketplaces, as seen in examples like Cox Automotive and Siemens, enable self-service data access.
  3. Extend Data Pattern:

    • Adds a data API layer to access curated datasets, providing control, governance, and standardization.
    • Enables the use of AI and machine learning, with capabilities like Amazon Bedrock Agents assisting in data processing workflows.
    • Salesforce's Data Cloud showcases the power of a data API, integrating structured, unstructured, and contextual data to drive customer-centric applications.

Aggregate Data Pattern

  • Bringing together data from various sources into Amazon S3 is the foundational first step for many customers.
  • The federated ownership model and access to diverse datasets enable applications like fraud analytics and knowledge bases.
  • Standardization practices, such as using Parquet files and Apache Iceberg, are crucial for success in the Aggregate Data Pattern.
  • New AWS capabilities like Amazon S3 Tables (managed Apache Iceberg tables) and Amazon S3 Metadata (queryable object metadata) strengthen the Aggregate Data Pattern.
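
The standardization point can be made concrete with a small sketch. It assumes a hypothetical registration step in which each source team declares a dataset before landing it in the shared bucket; the `DatasetRegistration` record, the `validate` helper, the bucket name, and the specific checks are illustrative assumptions, not something described in the session or part of any AWS API.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumption: the organization has agreed on Parquet as its landing format.
ALLOWED_FORMATS = {"parquet"}


@dataclass(frozen=True)
class DatasetRegistration:
    """Hypothetical record a source team submits before landing data."""

    name: str
    owner_team: str   # federated ownership: every dataset names its owner
    s3_prefix: str    # illustrative bucket and prefix, not a real location
    file_format: str


def validate(reg: DatasetRegistration) -> list[str]:
    """Return convention violations; an empty list means the dataset is accepted."""
    problems = []
    if reg.file_format.lower() not in ALLOWED_FORMATS:
        problems.append(f"format '{reg.file_format}' is not an agreed standard")
    if not reg.owner_team.strip():
        problems.append("dataset has no owning team")
    if not reg.s3_prefix.startswith("s3://"):
        problems.append("prefix must be an s3:// URI")
    return problems


good = DatasetRegistration(
    "transactions", "payments",
    "s3://example-data-lake/payments/transactions/", "parquet",
)
bad = DatasetRegistration("clicks", "", "s3://example-data-lake/web/clicks/", "csv")

print(validate(good))
print(validate(bad))
```

Running the check at registration time, rather than after data lands, is one way to keep ownership federated while still holding every team to the same format conventions.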

Curate Data Pattern

  • Rather than having each application owner work directly against the aggregated data, the Curate Data Pattern standardizes on a small number of high-quality datasets.
  • Data and business catalogs, like AWS Glue Data Catalog and Amazon DataZone, help with data discovery and governance.
  • Internal and external data marketplaces, as seen in examples like Cox Automotive and Siemens, enable self-service data access for application developers.
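
As a rough illustration of self-service discovery, the sketch below models a catalog as an in-memory registry of data products. The `DataProduct` and `Catalog` classes, the `certified` flag, and the sample entries are all hypothetical; a real deployment would rely on a catalog service such as those named above rather than this toy structure.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical catalog entry describing one curated dataset."""

    name: str
    owner: str
    description: str
    certified: bool = False  # assumption: a steward marks vetted products


class Catalog:
    """Toy in-memory stand-in for a data or business catalog."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def publish(self, product: DataProduct) -> None:
        self._products[product.name] = product

    def search(self, keyword: str, certified_only: bool = True) -> list[str]:
        """Return names of products whose description mentions the keyword."""
        kw = keyword.lower()
        return sorted(
            p.name
            for p in self._products.values()
            if kw in p.description.lower() and (p.certified or not certified_only)
        )


catalog = Catalog()
catalog.publish(DataProduct("orders_daily", "sales-data",
                            "Curated daily orders by region", certified=True))
catalog.publish(DataProduct("orders_raw", "ingest",
                            "Raw orders events, unvalidated"))

print(catalog.search("orders"))
print(catalog.search("orders", certified_only=False))
```

Defaulting the search to certified products reflects the pattern's emphasis on a few high-quality datasets: developers see the curated version first and have to opt in to anything else.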

Extend Data Pattern

  • The Extend Data Pattern adds a data API layer to access curated datasets, providing control, governance, and standardization.
  • The API serves as an intermediary, so application owners can use the data without building and owning their own pipelines.
  • Capabilities like Amazon Bedrock Agents, which use foundation models for data processing tasks, can be integrated into the data API.
  • Salesforce's Data Cloud showcases the power of a data API, integrating structured, unstructured, and contextual data to drive customer-centric applications.
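
One way to picture the control and governance a data API layer provides is a single entry point that checks a policy before returning any rows. The sketch below is a minimal illustration under stated assumptions: the `query` function, the `POLICIES` table, the caller name, and the sample dataset are invented for this example and do not describe Salesforce's Data Cloud or any AWS service.

```python
from __future__ import annotations

# Hypothetical curated dataset; a real API would query curated tables instead.
CURATED = {
    "customer_profiles": [
        {"customer_id": 1, "segment": "retail", "email": "a@example.com"},
        {"customer_id": 2, "segment": "enterprise", "email": "b@example.com"},
    ]
}

# Hypothetical policy: the columns each calling application may read.
POLICIES = {
    "marketing-app": {"customer_profiles": {"customer_id", "segment"}},
}


def query(caller: str, dataset: str, columns: list[str]) -> list[dict]:
    """Return the requested columns if the caller's policy allows all of them."""
    allowed = POLICIES.get(caller, {}).get(dataset)
    if allowed is None:
        raise PermissionError(f"{caller} has no access to {dataset}")
    denied = sorted(set(columns) - allowed)
    if denied:
        raise PermissionError(f"{caller} may not read columns: {denied}")
    # Project each row down to exactly the requested columns.
    return [{col: row[col] for col in columns} for row in CURATED[dataset]]


rows = query("marketing-app", "customer_profiles", ["customer_id", "segment"])
print(rows)
```

Because every application goes through `query`, access rules live in one place and callers never touch the underlying storage or pipelines directly.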

Conclusion

AWS provides the flexibility to choose and apply the data pattern that best suits the needs of different business units within an organization. The Aggregate, Curate, and Extend patterns can be deployed independently or in combination, depending on the specific requirements. With continuous innovation from AWS, customers can leverage the latest capabilities to accelerate their data-driven transformation.
