AWS re:Invent 2025 - Tracing the Untraceable: Full-Stack Observability for LLMs and Agents (AIM212)
The Importance of LLM Observability
Modern AI-powered applications have grown significantly more complex, combining large language models (LLMs), multi-turn agents, retrieval pipelines, and tool augmentation.
Key observability concerns include:
Token usage monitoring for cost management
Response latency tracking for customer-facing applications
Error rate detection for identifying model failures
Response quality monitoring to catch hallucinations and inaccuracies
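To ground the cost concern above, here is a minimal sketch of estimating spend from token counts; the per-token prices and the example figures are illustrative assumptions, not actual model pricing.

```python
# Minimal sketch: estimating LLM spend from token usage.
# The prices below are illustrative placeholders, not real Bedrock rates.
PRICE_PER_1K_INPUT = 0.003   # assumed USD per 1K input tokens
PRICE_PER_1K_OUTPUT = 0.015  # assumed USD per 1K output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single LLM call."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT + \
           (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# Example: a call that consumed 1,200 input and 400 output tokens
print(f"${estimate_cost(1200, 400):.4f}")  # -> $0.0096
```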
The rapid pace of change in AI-powered applications makes it challenging to observe and instrument these systems effectively.
Sensitive user data introduced through LLM interactions also raises privacy and security concerns.
Groundcover's Approach to LLM Observability
Bring Your Own Cloud architecture: Groundcover deploys and manages an observability backend within the customer's own AWS environment, ensuring data privacy and control.
eBPF-based instrumentation: Groundcover uses eBPF, a kernel-level visibility technology, to automatically instrument applications without modifying code or redeploying.
Real-time insights: The eBPF-based instrumentation allows Groundcover to capture full API requests and responses, including LLM prompts and outputs, and correlate them with other observability data.
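To make the eBPF idea concrete, here is a minimal sketch (not Groundcover's actual implementation) using the BCC Python bindings to attach a uprobe to OpenSSL's SSL_write, the kind of kernel-level hook that lets an agent observe encrypted API traffic such as LLM requests without touching application code. It assumes a Linux host with BCC installed and processes linked against libssl.

```python
# Minimal eBPF sketch using BCC: count TLS writes per process by hooking
# OpenSSL's SSL_write via a uprobe. This illustrates the general technique
# (kernel-level hooks, no app changes); it is NOT Groundcover's implementation.
from bcc import BPF

bpf_program = r"""
BPF_HASH(bytes_written, u32, u64);   // pid -> total bytes passed to SSL_write

int trace_ssl_write(struct pt_regs *ctx, void *ssl, void *buf, int num) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 zero = 0, *total = bytes_written.lookup_or_try_init(&pid, &zero);
    if (total) {
        __sync_fetch_and_add(total, num);
    }
    return 0;
}
"""

b = BPF(text=bpf_program)
# "ssl" resolves to the system libssl; the symbol name assumes OpenSSL.
b.attach_uprobe(name="ssl", sym="SSL_write", fn_name="trace_ssl_write")

print("Tracing SSL_write... Ctrl-C to stop")
try:
    b.trace_print()  # blocks; per-pid byte counts accumulate in the map
except KeyboardInterrupt:
    for pid, total in b["bytes_written"].items():
        print(f"pid {pid.value}: {total.value} bytes over TLS")
```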
Key Observability Capabilities
Protecting LLM Pipelines from Data Exposure:
Sensitive user data captured by the eBPF instrumentation is stored within the customer's own environment, with no third-party access or egress.
Improving Performance and Managing Spend:
Monitoring token usage, latency, throughput, and error rates to optimize LLM performance and cost.
Monitoring LLM Responses:
Analyzing LLM response payloads, reasoning paths, and response variations to ensure accuracy and identify issues.
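One simple way to quantify response variation (a hypothetical heuristic for illustration, not the method shown in the talk) is to sample the same prompt several times and score pairwise similarity; consistently low similarity can flag unstable or potentially unreliable answers.

```python
# Hypothetical heuristic for monitoring response variation: score pairwise
# similarity across repeated answers to the same prompt. Low average
# similarity may indicate unstable or unreliable model behavior.
from difflib import SequenceMatcher
from itertools import combinations

def response_stability(responses: list[str]) -> float:
    """Mean pairwise similarity ratio (0 = all different, 1 = identical)."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 1.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

samples = [
    "The invoice total is $420.",
    "The invoice total is $420.",
    "The total on the invoice comes to $42.",  # suspicious outlier
]
score = response_stability(samples)
if score < 0.8:  # threshold is an assumption; tune against your own data
    print(f"Low stability ({score:.2f}): flag for review")
```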
Observability in Action
Deployment and Dependency Mapping:
Groundcover automatically detects and instruments LLM-powered applications, including calls to Amazon Bedrock, without any manual configuration.
Provides a dependency map of microservices and their interactions.
API Call Visibility:
Displays detailed metrics on LLM API calls, including latency, throughput, and model usage.
Trace Analysis:
Captures full API request and response details, including LLM prompts and outputs, and correlates them with logs and infrastructure metrics.
Enables troubleshooting of errors and high-latency issues.
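For teams instrumenting by hand rather than via eBPF, a similar correlation can be approximated with OpenTelemetry spans. The sketch below records an LLM call's prompt, output, and token counts as span attributes using the (still-evolving) gen_ai semantic-convention names; the model ID and token counts are illustrative assumptions.

```python
# Sketch: manually attaching LLM request/response details to a trace span
# with OpenTelemetry. Attribute names follow the draft gen_ai semantic
# conventions; the model ID and token counts are illustrative assumptions.
from opentelemetry import trace

# (Configure a TracerProvider/exporter for spans to actually be recorded;
# without one, this runs as a no-op.)
tracer = trace.get_tracer("llm-demo")

def invoke_model(prompt: str) -> str:
    with tracer.start_as_current_span("bedrock.invoke_model") as span:
        span.set_attribute("gen_ai.request.model", "anthropic.claude-3-sonnet")  # assumed ID
        span.set_attribute("gen_ai.prompt", prompt)
        # ... real code would call the Bedrock runtime API here ...
        completion = "stubbed response"
        span.set_attribute("gen_ai.completion", completion)
        span.set_attribute("gen_ai.usage.input_tokens", 42)   # placeholder
        span.set_attribute("gen_ai.usage.output_tokens", 7)   # placeholder
        return completion

invoke_model("Summarize last quarter's incidents.")
```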
Performance and Cost Optimization:
Analyzes P90 latency and token usage trends to identify performance bottlenecks and cost optimization opportunities.
Allows setting SLA-based alerts for latency breaches.
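A minimal sketch of the underlying P90/SLA check, computed here with Python's standard library; the 2-second threshold is an assumed SLA for illustration, not one from the talk.

```python
# Minimal sketch: compute P90 latency from samples and check an SLA
# threshold. The 2.0s SLA below is an assumption for illustration.
from statistics import quantiles

latencies_s = [0.8, 1.1, 0.9, 2.4, 1.0, 1.3, 0.7, 3.1, 1.2, 0.9]

# quantiles(n=10) returns 9 cut points; index 8 is the 90th percentile.
p90 = quantiles(latencies_s, n=10)[8]

SLA_SECONDS = 2.0  # assumed SLA target
if p90 > SLA_SECONDS:
    print(f"ALERT: P90 latency {p90:.2f}s breaches {SLA_SECONDS}s SLA")
else:
    print(f"OK: P90 latency {p90:.2f}s within SLA")
```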
Dashboarding and Reporting:
Provides pre-built dashboards for monitoring LLM usage, performance, and cost across the organization.
Enables custom dashboards and reports to fit specific business needs.
Key Takeaways
Groundcover's approach to LLM observability addresses the challenges of rapid change, data sensitivity, and the need for deep visibility into AI-powered applications.
The combination of eBPF-based instrumentation and a Bring Your Own Cloud architecture enables comprehensive observability without compromising data privacy and security.
The observability capabilities provided by Groundcover allow organizations to protect sensitive data, optimize LLM performance and cost, and ensure the accuracy and reliability of their AI-powered applications.
The detailed examples and use cases demonstrate the practical value of Groundcover's LLM observability solution in real-world scenarios.