AWS re:Invent 2025 - Fine-tuning models for accuracy and latency at Robinhood Markets (IND392)

Leveraging Fine-Tuning for Accuracy and Latency at Robinhood Markets

Robinhood's AI Vision and Mission

  • Robinhood's mission is to democratize finance for all, providing users the same level of support and insight as the ultra-wealthy
  • To achieve this, Robinhood believes harnessing the power of AI and machine learning is crucial

Key Generative AI Challenges

  • Robinhood faced challenges in improving accuracy, reducing cost, and lowering latency when using large language models (LLMs) in critical production workflows
  • Strategies employed include data curation, model right-sizing, fine-tuning, and optimized deployment options

Robinhood's Generative AI Use Cases

  1. Cortex Digest: Automatically generates summaries explaining stock price movements to users
    • Fine-tuning helps with vocabulary, objectivity, and identifying important information
  2. Custom Indicators and Scans: Allows users to create trading logic using natural language
    • Democratizes algorithmic trading by translating queries into executable code
  3. CX AI Agent: Robinhood's customer support chatbot, built in multiple stages:
    • Intent understanding, planner/tool selection, and final answer generation
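The staged structure of the CX agent can be sketched as a simple pipeline. This is a minimal illustration, not Robinhood's implementation: the stage names follow the talk, but the stub logic (keyword matching, a static tool table) stands in for the LLM calls each stage would actually make, and all identifiers here are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class AgentTurn:
    user_message: str
    intent: str = ""
    plan: list = field(default_factory=list)
    answer: str = ""

def understand_intent(turn: AgentTurn) -> AgentTurn:
    # Stage 1: intent understanding (keyword stub standing in for an LLM classifier).
    turn.intent = "account_question" if "account" in turn.user_message.lower() else "general"
    return turn

def plan_tools(turn: AgentTurn) -> AgentTurn:
    # Stage 2: planner / tool selection for the detected intent.
    tool_table = {"account_question": ["lookup_account"], "general": ["search_kb"]}
    turn.plan = tool_table[turn.intent]
    return turn

def generate_answer(turn: AgentTurn) -> AgentTurn:
    # Stage 3: final answer generation from the selected tools' results.
    turn.answer = f"(answer composed from {', '.join(turn.plan)})"
    return turn

turn = generate_answer(plan_tools(understand_intent(
    AgentTurn("Why is my account restricted?"))))
```

Separating the stages like this is what makes per-stage tuning (prompt tuning the planner, fine-tuning the answer generator) possible, since each stage can be evaluated and optimized in isolation.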

The Generative AI Trilemma

  • In the world of generative AI, cost, quality, and latency are often at odds with each other
  • Even small regressions in one area can jeopardize end-user experience

Robinhood's Tuning Roadmap

  1. Prompt Tuning:
    • Optimizes prompts across multiple stages of the agent pipeline
    • Uses a prompt optimization loop to generate and evaluate prompt candidates
  2. Trajectory Tuning:
    • Injects dynamic few-shot examples into the planner stage to improve quality
    • Balances quality uplift with increased context length and latency
  3. Fine-Tuning:
    • Focuses on data quality over quantity when creating the training dataset
    • Leverages techniques like LoRA (Low-Rank Adaptation) to reduce trainable parameters
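Trajectory tuning's dynamic few-shot injection can be sketched as retrieve-then-prepend: find the past trajectories most similar to the incoming query and splice them into the planner prompt. This is an assumed, simplified version; the word-overlap similarity below is a stand-in for the embedding-based retrieval a production system would use, and the prompt format is invented for illustration.

```python
def inject_few_shots(query: str, trajectory_bank: list, k: int = 2) -> str:
    """Prepend the k most similar past trajectories to the planner prompt.

    Similarity here is naive word overlap (a placeholder for embedding
    similarity). Each bank entry is {"query": ..., "plan": ...}.
    """
    def overlap(a: str, b: str) -> int:
        return len(set(a.lower().split()) & set(b.lower().split()))

    ranked = sorted(trajectory_bank,
                    key=lambda t: overlap(query, t["query"]), reverse=True)
    examples = "\n\n".join(f"Q: {t['query']}\nPlan: {t['plan']}"
                           for t in ranked[:k])
    return f"{examples}\n\nQ: {query}\nPlan:"

bank = [
    {"query": "scan for stocks above 50 day moving average", "plan": "ma_scan"},
    {"query": "summarize my portfolio", "plan": "portfolio_summary"},
]
prompt = inject_few_shots("scan stocks above moving average", bank, k=1)
```

Note the trade-off the talk calls out: every injected example lengthens the context, so k must be chosen to balance planner quality against latency.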

LoRA: Robinhood's Fine-Tuning Approach

  • LoRA significantly reduces the number of trainable parameters compared to full fine-tuning
  • Enables scalable fine-tuning across multiple use cases with:
    • Faster training times
    • Lower costs
    • Portable models
  • Robinhood integrates LoRA into their fine-tuning platform, leveraging Amazon SageMaker and Amazon Bedrock
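The parameter savings behind LoRA are easy to verify with arithmetic: instead of updating the full weight matrix W (d_in × d_out), LoRA freezes W and trains two low-rank factors A (d_in × r) and B (r × d_out), so the effective weight is W + AB. The dimensions below are illustrative, not from the talk.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int):
    """Compare trainable parameters: full fine-tuning vs. LoRA.

    Full fine-tuning updates every entry of W (d_in * d_out params);
    LoRA trains only A (d_in * rank) and B (rank * d_out).
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora

# e.g. a 4096x4096 projection adapted at rank 8:
full, lora = lora_param_counts(4096, 4096, 8)
# full = 16,777,216 trainable params; lora = 65,536 (~0.4% of full)
```

Because the frozen base model is shared, each use case ships only its small A/B adapter, which is what makes the models portable and the training fast and cheap across many use cases.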

Results and Lessons Learned

  • Robinhood's LoRA-based fine-tuned models achieved over 50% latency savings compared to previous models
  • Maintained quality parity with frontier models
  • Key lessons:
    • Importance of robust evaluation frameworks
    • Data preparation strategy (quality over quantity)
    • Methodical approach to tuning techniques
    • Leveraging AWS services for inference optimization

Conclusion

  • Robinhood's sophisticated use of AWS services and fine-tuning techniques demonstrates the potential for generative AI in regulated industries
  • Their approach can serve as a model for other organizations looking to reliably deploy generative AI in production environments
