AWS re:Invent 2025 - Tracing the Untraceable: Full-Stack Observability for LLMs and Agents (AIM212)
The Importance of LLM Observability
Modern AI-powered applications have grown significantly more complex, combining large language models (LLMs), multi-turn agents, retrieval pipelines, and tool augmentation.
Key observability concerns include:
Token usage monitoring for cost management
Response latency tracking for customer-facing applications
Error rate detection for identifying model failures
Response quality monitoring to catch hallucinations and inaccuracies
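To ground the cost concern above, here is a minimal sketch of estimating spend from token counts; the per-token prices and the example figures are illustrative assumptions, not actual model pricing.

```python
# Minimal sketch: estimating LLM spend from token usage.
# The prices below are illustrative placeholders, not real Bedrock rates.
PRICE_PER_1K_INPUT = 0.003   # assumed USD per 1K input tokens
PRICE_PER_1K_OUTPUT = 0.015  # assumed USD per 1K output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single LLM call."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT + \
           (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# Example: a call that consumed 1,200 input and 400 output tokens
print(f"${estimate_cost(1200, 400):.4f}")  # -> $0.0096
```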
The rapid pace of change in AI-powered applications makes it challenging to observe and instrument these systems effectively.
Sensitive user data introduced through LLM interactions also raises privacy and security concerns.
Groundcover's Approach to LLM Observability
Bring Your Own Cloud architecture: Groundcover deploys and manages an observability backend within the customer's own AWS environment, ensuring data privacy and control.
eBPF-based instrumentation: Groundcover uses eBPF, a kernel-level visibility technology, to automatically instrument applications without modifying code or redeploying.
Real-time insights: The eBPF-based instrumentation allows Groundcover to capture full API requests and responses, including LLM prompts and outputs, and correlate them with other observability data.
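To make the eBPF idea concrete, here is a minimal sketch (not Groundcover's actual implementation) using the BCC Python bindings to attach a uprobe to OpenSSL's SSL_write, the kind of kernel-level hook that lets an agent observe encrypted API traffic such as LLM requests without touching application code. It assumes a Linux host with BCC installed and processes linked against libssl.

```python
# Minimal eBPF sketch using BCC: count TLS writes per process by hooking
# OpenSSL's SSL_write via a uprobe. This illustrates the general technique
# (kernel-level hooks, no app changes); it is NOT Groundcover's implementation.
from bcc import BPF

bpf_program = r"""
BPF_HASH(bytes_written, u32, u64);   // pid -> total bytes passed to SSL_write

int trace_ssl_write(struct pt_regs *ctx, void *ssl, void *buf, int num) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 zero = 0, *total = bytes_written.lookup_or_try_init(&pid, &zero);
    if (total) {
        __sync_fetch_and_add(total, num);
    }
    return 0;
}
"""

b = BPF(text=bpf_program)
# "ssl" resolves to the system libssl; the symbol name assumes OpenSSL.
b.attach_uprobe(name="ssl", sym="SSL_write", fn_name="trace_ssl_write")

print("Tracing SSL_write... Ctrl-C to stop")
try:
    b.trace_print()  # blocks; per-pid byte counts accumulate in the map
except KeyboardInterrupt:
    for pid, total in b["bytes_written"].items():
        print(f"pid {pid.value}: {total.value} bytes over TLS")
```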
Key Observability Capabilities
Protecting LLM Pipelines from Data Exposure:
Sensitive user data captured by the eBPF instrumentation is stored within the customer's own environment, with no third-party access or egress.
Improving Performance and Managing Spend:
Monitoring token usage, latency, throughput, and error rates to optimize LLM performance and cost.
Monitoring LLM Responses:
Analyzing LLM response payloads, reasoning paths, and response variations to ensure accuracy and identify issues.
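One simple way to quantify response variation (a hypothetical heuristic for illustration, not the method shown in the talk) is to sample the same prompt several times and score pairwise similarity; consistently low similarity can flag unstable or potentially unreliable answers.

```python
# Hypothetical heuristic for monitoring response variation: score pairwise
# similarity across repeated answers to the same prompt. Low average
# similarity may indicate unstable or unreliable model behavior.
from difflib import SequenceMatcher
from itertools import combinations

def response_stability(responses: list[str]) -> float:
    """Mean pairwise similarity ratio (0 = all different, 1 = identical)."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 1.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

samples = [
    "The invoice total is $420.",
    "The invoice total is $420.",
    "The total on the invoice comes to $42.",  # suspicious outlier
]
score = response_stability(samples)
if score < 0.8:  # threshold is an assumption; tune against your own data
    print(f"Low stability ({score:.2f}): flag for review")
```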
Observability in Action
Deployment and Dependency Mapping:
Groundcover automatically detects and instruments LLM-powered applications, including calls to Amazon Bedrock, without any manual configuration.
Provides a dependency map of microservices and their interactions.
API Call Visibility:
Displays detailed metrics on LLM API calls, including latency, throughput, and model usage.
Trace Analysis:
Captures full API request and response details, including LLM prompts and outputs, and correlates them with logs and infrastructure metrics.
Enables troubleshooting of errors and high-latency issues.
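For teams instrumenting by hand rather than via eBPF, a similar correlation can be approximated with OpenTelemetry spans. The sketch below records an LLM call's prompt, output, and token counts as span attributes using the (still-evolving) gen_ai semantic-convention names; the model ID and token counts are illustrative assumptions.

```python
# Sketch: manually attaching LLM request/response details to a trace span
# with OpenTelemetry. Attribute names follow the draft gen_ai semantic
# conventions; the model ID and token counts are illustrative assumptions.
from opentelemetry import trace

# (Configure a TracerProvider/exporter for spans to actually be recorded;
# without one, this runs as a no-op.)
tracer = trace.get_tracer("llm-demo")

def invoke_model(prompt: str) -> str:
    with tracer.start_as_current_span("bedrock.invoke_model") as span:
        span.set_attribute("gen_ai.request.model", "anthropic.claude-3-sonnet")  # assumed ID
        span.set_attribute("gen_ai.prompt", prompt)
        # ... real code would call the Bedrock runtime API here ...
        completion = "stubbed response"
        span.set_attribute("gen_ai.completion", completion)
        span.set_attribute("gen_ai.usage.input_tokens", 42)   # placeholder
        span.set_attribute("gen_ai.usage.output_tokens", 7)   # placeholder
        return completion

invoke_model("Summarize last quarter's incidents.")
```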
Performance and Cost Optimization:
Analyzes P90 latency and token usage trends to identify performance bottlenecks and cost optimization opportunities.
Allows setting SLA-based alerts for latency breaches.
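A minimal sketch of the underlying P90/SLA check, computed here with Python's standard library; the 2-second threshold is an assumed SLA for illustration, not one from the talk.

```python
# Minimal sketch: compute P90 latency from samples and check an SLA
# threshold. The 2.0s SLA below is an assumption for illustration.
from statistics import quantiles

latencies_s = [0.8, 1.1, 0.9, 2.4, 1.0, 1.3, 0.7, 3.1, 1.2, 0.9]

# quantiles(n=10) returns 9 cut points; index 8 is the 90th percentile.
p90 = quantiles(latencies_s, n=10)[8]

SLA_SECONDS = 2.0  # assumed SLA target
if p90 > SLA_SECONDS:
    print(f"ALERT: P90 latency {p90:.2f}s breaches {SLA_SECONDS}s SLA")
else:
    print(f"OK: P90 latency {p90:.2f}s within SLA")
```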
Dashboarding and Reporting:
Provides pre-built dashboards for monitoring LLM usage, performance, and cost across the organization.
Enables custom dashboards and reports to fit specific business needs.
Key Takeaways
Groundcover's approach to LLM observability addresses the challenges of rapid change, data sensitivity, and the need for deep visibility into AI-powered applications.
The combination of eBPF-based instrumentation and a Bring Your Own Cloud architecture enables comprehensive observability without compromising data privacy and security.
The observability capabilities provided by Groundcover allow organizations to protect sensitive data, optimize LLM performance and cost, and ensure the accuracy and reliability of their AI-powered applications.
The detailed examples and use cases demonstrate the practical value of Groundcover's LLM observability solution in real-world scenarios.