Introduction
- Lee Slezak: Senior Vice President of Data and Analytics for Lennar Homes
- Varun: Senior Director of Data and Analytics for Lennar Homes
- Mylin Ackermann: Senior Solutions Architect within the Data and AI Practice at PwC
- Rohit Sinha: Managing Director in PwC's Data and AI Practice, focused on AWS for the firm
The session covers Lennar's journey in building a modern data and AI platform on AWS to scale AI across the organization.
Where Lennar Was
- Siloed approach to data, with multiple legacy systems and previous failed consolidation attempts
- Operational issues with constant data quality and latency problems
- The initial MVP data platform was only a data warehouse, with no Data Lake and no modeled data
Lennar's Approach
- Partnered with PwC to identify a strategic first use case and build trust with technology and business leadership
- Followed a domain-based migration approach to onboard use cases and users in a staggered fashion
Key requirements:
- Access to information with less than one hour of latency
- Enable a multi-user platform for co-development with CI/CD and DevOps
- Provide analytics at scale with fine-grained access control
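The latency requirement above can be expressed as a simple freshness check. This is a minimal sketch, not Lennar's implementation: the function name `is_fresh`, the constant `MAX_LATENCY`, and the sample timestamps are all hypothetical and only illustrate how a "less than one hour" target might be tested against the time of the last successful load.

```python
from datetime import datetime, timedelta, timezone

# Assumed SLA, taken from the "less than one hour" requirement above.
MAX_LATENCY = timedelta(hours=1)


def is_fresh(last_loaded_at: datetime, now: datetime,
             max_latency: timedelta = MAX_LATENCY) -> bool:
    """Return True when the last load is younger than the allowed latency."""
    return now - last_loaded_at < max_latency


# Hypothetical timestamps for illustration only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = is_fresh(datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc), now)
stale = is_fresh(datetime(2024, 1, 1, 10, 45, tzinfo=timezone.utc), now)
```

A check like this would typically run after each load and raise an alert when it returns False.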
Architecture and Patterns
Core Tenets
- Cross-functional access to data with fine-grained access control
- Data quality and lineage as a foundational component
- Sub-hourly reporting and recording of data
- Optimized cloud TCO through auto-scaling and segregation of consumption
Architecture Overview
- Medallion-style architecture with a Bronze (raw), Silver (curated), and Gold (dimensional/aggregated) layer
- Integrated Ataccama for data quality and lineage, along with a custom audit, balance, and control framework
- Leveraged AWS services such as Glue, Lambda, and EMR, along with dbt, for scalable data processing
- Used the open-source Apache Iceberg table format in the Data Lake for schema evolution and performance
- Snowflake for dimensional and aggregated data
- Integrated with Palantir Foundry and Amazon SageMaker for ML and AI workflows
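The session does not describe how the custom audit, balance, and control framework works internally. As a rough illustration of the general idea, the sketch below compares a row count and a control total between two layers. Every name here (`ControlTotals`, `summarize`, `is_balanced`, the `amount` field, the sample rows) is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlTotals:
    """Summary figures captured for one layer of one batch."""
    row_count: int
    amount_sum: float


def summarize(rows: list, amount_field: str) -> ControlTotals:
    """Compute the row count and the sum of a numeric control field."""
    return ControlTotals(len(rows), sum(r[amount_field] for r in rows))


def is_balanced(source: ControlTotals, target: ControlTotals,
                tolerance: float = 0.01) -> bool:
    """True when counts match and control sums agree within the tolerance."""
    return (source.row_count == target.row_count
            and abs(source.amount_sum - target.amount_sum) <= tolerance)


# Hypothetical Bronze and Silver rows for illustration only.
bronze_rows = [{"order_id": 1, "amount": 100.0}, {"order_id": 2, "amount": 250.5}]
silver_rows = [{"order_id": 1, "amount": 100.0}, {"order_id": 2, "amount": 250.5}]
partial_rows = silver_rows[:1]  # simulates a dropped record

balanced = is_balanced(summarize(bronze_rows, "amount"),
                       summarize(silver_rows, "amount"))
unbalanced = is_balanced(summarize(bronze_rows, "amount"),
                         summarize(partial_rows, "amount"))
```

In practice, the totals for each layer would be written to an audit table so that a mismatch can be traced back to a specific batch.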
Ingestion and Processing Patterns
- Used DMS, AppFlow, Glue APIs, AWS Transfer Family, and Qlik for various ingestion patterns
- Bronze layer ingestion is triggered by S3 events, and the data is then loaded into a pre-stage Silver layer
- 30-minute batch cycles to populate target-centric structures in Snowflake and Data Lake
- Leveraged dbt for a smooth transition and performance benefits
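The event-triggered ingestion and 30-minute batch cycle described above might look roughly like the sketch below. It assumes the standard S3 event notification payload shape, in which object keys arrive URL-encoded. The bucket name, object key, and the helpers `extract_objects` and `batch_window_start` are hypothetical and are not taken from the session.

```python
from datetime import datetime, timezone
from urllib.parse import unquote_plus

# Assumed batch size, taken from the 30-minute cycle mentioned above.
BATCH_MINUTES = 30


def extract_objects(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys are URL-encoded in S3 event notifications.
        objects.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return objects


def batch_window_start(ts: datetime, minutes: int = BATCH_MINUTES) -> datetime:
    """Floor a timestamp to the start of its batch window."""
    return ts.replace(minute=ts.minute - ts.minute % minutes,
                      second=0, microsecond=0)


# Hypothetical event for illustration only.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-bronze"},
                "object": {"key": "sales/2024/home+sales.csv"}}}
    ]
}
objects = extract_objects(sample_event)
window = batch_window_start(datetime(2024, 1, 1, 12, 47, 13, tzinfo=timezone.utc))
```

A Lambda handler could call `extract_objects` on its incoming event, and a scheduler could use `batch_window_start` to group newly landed files into the cycle that loads them downstream.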
Consumption Layer
- Snowflake for dimensional and aggregated structures to enable self-service analytics
- Palantir Foundry and Amazon SageMaker integrated for ML and AI workflows
Value Realization for Lennar
- Consolidated data platform to reduce costs and unify data
- Improved productivity with faster access to key metrics like home sales, closings, and starts
- Built trust through robust data management and lineage
- Laid the foundation for future advanced use cases, including generative AI
Key Takeaways
- The importance of a strategic partnership for end-to-end planning and execution
- Emphasis on architecture and design for scalability, repeatability, and maintainability
- Building a future-proof platform by leveraging external expertise and internal perspectives