Machine Learning at CarGurus
Introduction
- The presenters are Jason Staley, a Solutions Architect at AWS, along with Jason Tan and Ru, data scientists and engineers at CarGurus.
- CarGurus is an online automotive platform that connects car dealers and shoppers, providing tools and products to support the car shopping process.
- Machine learning and data are essential to CarGurus' business model and the value they provide to customers.
Industry Trends in Machine Learning and Digital Customer Experience
- Large datasets are becoming more useful, enabling more complex models and new insights.
- However, managing these large datasets and ML models requires better practices, automation, and operational maturity.
- In digital customer experience, the goal is to improve customer outcomes through relevant recommendations, search results, and personalization.
- Timeliness is key: personalization is most valuable when it happens in real time, not hours or days later.
CarGurus' Machine Learning Journey
- CarGurus has leveraged machine learning since its early days, powering features like:
  - Recommendations
  - Search optimization
  - Marketing strategy
  - Instant Market Value (IMV) algorithms for pricing transparency
- IMV is a key differentiator, using ML to estimate fair market prices for used vehicles and provide deal ratings to customers.
- Implementing IMV at scale for a diverse used vehicle market is a significant challenge that requires advanced ML techniques.
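
To make the link between a market-value estimate and a deal rating concrete, here is a minimal sketch. It assumes a rating is derived from the ratio of asking price to estimated value; the `Listing` type, the rating labels, and every threshold in `RATING_BANDS` are hypothetical placeholders, not CarGurus' actual IMV logic.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    asking_price: float     # price the dealer lists the car at
    estimated_value: float  # model's market-value estimate (stand-in for an IMV output)


# Hypothetical cut-offs on asking_price / estimated_value.
# These are illustrative only, not real CarGurus thresholds.
RATING_BANDS = [
    (0.90, "great deal"),
    (0.97, "good deal"),
    (1.03, "fair deal"),
    (1.10, "high price"),
]


def deal_rating(listing: Listing) -> str:
    """Map the ratio of asking price to estimated value onto a coarse rating."""
    if listing.estimated_value <= 0:
        raise ValueError("estimated_value must be positive")
    ratio = listing.asking_price / listing.estimated_value
    for upper_bound, label in RATING_BANDS:
        if ratio <= upper_bound:
            return label
    return "overpriced"
```

The hard part described in the talk is producing a trustworthy `estimated_value` across a diverse used-vehicle market; the rating step itself is the simple part shown here.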
Scaling Machine Learning at CarGurus
- As CarGurus' ML use cases grew, they realized they needed to invest in their ML operations and tooling to scale effectively.
- Key goals:
  - Consistent execution between development and production
  - Ability to integrate existing models
  - Enforce best practices and patterns
  - Reduce friction in deployment
- Adopting Amazon SageMaker was a foundational decision, since it provides a flexible, fully featured ML platform.
CarGurus' ML Platform Architecture
- SageMaker Notebooks: Used for experimentation and prototyping, with secure access and cost tracking.
- CG SageMaker: An internal framework that abstracts complex SageMaker interactions, enforces best practices, and provides templates to bootstrap new projects.
- SageMaker Pipelines: Declarative, reproducible ML pipelines that handle training, evaluation, model registration, and deployment as shadow variants.
- Scheduling and Monitoring: Pipelines are automatically scheduled, with CloudWatch and SageMaker Model Monitor providing operational and model quality monitoring.
- Model Promotion: Healthy shadow variants can be promoted to production, leveraging the model registry and advanced deployment features.
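
The shadow-variant idea above can be sketched in plain Python. This is not the SageMaker API or the CG SageMaker framework: `ShadowEndpoint`, the mean-absolute-error health check, and `min_requests` are all assumptions chosen for illustration. The point is only that the shadow model scores the same requests as production, its answers are never returned to callers, and promotion is gated on observed quality.

```python
from statistics import mean
from typing import Callable, List, Tuple

Model = Callable[[float], float]


class ShadowEndpoint:
    """Serves the production model while silently scoring a shadow model."""

    def __init__(self, production: Model, shadow: Model) -> None:
        self.production = production
        self.shadow = shadow
        # One (production_prediction, shadow_prediction) pair per request.
        self.log: List[Tuple[float, float]] = []

    def predict(self, x: float) -> float:
        prod_pred = self.production(x)
        shadow_pred = self.shadow(x)  # scored on the same input, never returned
        self.log.append((prod_pred, shadow_pred))
        return prod_pred

    def should_promote(self, actuals: List[float], min_requests: int = 3) -> bool:
        """Hypothetical gate: shadow MAE must be no worse than production MAE."""
        if len(self.log) < min_requests or len(actuals) != len(self.log):
            return False
        prod_mae = mean(abs(p - y) for (p, _), y in zip(self.log, actuals))
        shadow_mae = mean(abs(s - y) for (_, s), y in zip(self.log, actuals))
        return shadow_mae <= prod_mae

    def promote(self) -> None:
        """Make the shadow model the production model and reset the log."""
        self.production = self.shadow
        self.log.clear()
```

In a real deployment the comparison would more likely come from monitoring metrics than from an in-process log, but the gate-then-promote shape is the same.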
Real-time Recommendations with a Feature Store
- CarGurus built a real-time data pipeline and feature store to power low-latency recommendations.
- Key requirements: managed services, consistency between training and inference, and leveraging existing feature definitions.
- The solution uses Kinesis, Flink, Kafka, and ElastiCache to ingest events, compute features, and serve them to real-time recommendation models.
- Balancing complexity and performance trade-offs was a key challenge, requiring close collaboration between ML engineers and data scientists.
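
One way to picture the "consistency between training and inference" requirement is a single feature definition shared by the streaming and batch paths. The sketch below is an in-memory stand-in, not the Kinesis/Flink/Kafka/ElastiCache pipeline itself: the event shape, the one-hour window, the feature names, and the `OnlineFeatureStore` class are all hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

WINDOW_SECONDS = 3600  # hypothetical one-hour lookback

Event = Tuple[str, float, float]  # (user_id, timestamp_seconds, listing_price)


def compute_features(events: List[Tuple[float, float]], now: float) -> Dict[str, float]:
    """Single feature definition shared by the training and serving paths."""
    recent = [price for ts, price in events if now - ts <= WINDOW_SECONDS]
    return {
        "views_last_hour": float(len(recent)),
        "avg_price_viewed": sum(recent) / len(recent) if recent else 0.0,
    }


class OnlineFeatureStore:
    """In-memory stand-in for a low-latency store such as a cache."""

    def __init__(self) -> None:
        self._events: Dict[str, Deque[Tuple[float, float]]] = defaultdict(deque)
        self._features: Dict[str, Dict[str, float]] = {}

    def ingest(self, event: Event) -> None:
        """Streaming path: update one user's features as each event arrives."""
        user_id, ts, price = event
        window = self._events[user_id]
        window.append((ts, price))
        # Drop events that have aged out of the lookback window.
        while window and ts - window[0][0] > WINDOW_SECONDS:
            window.popleft()
        self._features[user_id] = compute_features(list(window), now=ts)

    def get(self, user_id: str) -> Dict[str, float]:
        """Serving path: return precomputed features, or defaults for unknown users."""
        return self._features.get(user_id, compute_features([], now=0.0))


def offline_features(history: List[Event], user_id: str, as_of: float) -> Dict[str, float]:
    """Batch path: rebuild the same features from logged events for training."""
    user_events = [(ts, price) for uid, ts, price in history if uid == user_id and ts <= as_of]
    return compute_features(user_events, now=as_of)
```

Because both paths call `compute_features`, a model trained on `offline_features` sees the same values at inference time that `OnlineFeatureStore.get` serves, which is the property the requirement is after.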
Lessons Learned
- Standardizing best practices and encoding them into tooling is critical for scaling ML effectively.
- Automating ML operations, from training to deployment and monitoring, unlocks faster iteration and higher confidence.
- Involving both ML engineers and data scientists in the design of the ML platform ensures it meets the needs of the entire team.
- ML is an ongoing journey of learning and iteration - there is always more to improve.