AWS re:Invent 2025 - AI-Native Era of Observability: How You Can Get Started Today (AIM220)

The AI-Native Era of Observability: How to Get Started Today

The Evolving Software Landscape

  • Software complexity has increased dramatically in recent years
  • Monolithic architectures have given way to microservices, multi-cloud, and serverless environments
  • This shift has made software systems much harder to understand and manage

The Observability Challenge

  • Telemetry data (logs, metrics, traces) has exploded, growing 100x in just 8 years
  • More data does not necessarily mean better observability or reliability
  • Teams that rely on dashboards, alerts, and manual investigation are overwhelmed by the volume
  • 92% of dashboards are used for only one week, yet teams keep adding more
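
As a quick check on the 100x figure above, the implied yearly growth rate can be worked out directly. This sketch assumes growth compounded smoothly from year to year, which is an assumption; the talk gives only the two endpoints.

```python
# Implied compound annual growth for a 100x increase over 8 years.
# Assumption: growth compounds evenly each year (the talk does not say this).
growth_factor = 100.0
years = 8

annual_multiplier = growth_factor ** (1 / years)
annual_growth_pct = (annual_multiplier - 1) * 100

print(f"Yearly multiplier: {annual_multiplier:.2f}x")  # 1.78x
print(f"Yearly growth: {annual_growth_pct:.0f}%")      # 78%
```

Under that assumption, telemetry volume grows by roughly 78% each year, so it doubles about every 14 months.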

The Limitations of Traditional Observability

  • Observability is a passive term; the real goals are reliability, stability, and business insight
  • Companies have invested heavily in observability tools but have not seen commensurate improvements in reliability
  • Data is growing faster than teams can manage it and derive value from it

Introducing Oolie: The Autonomous Observability Agent

  • Oolie is presented as the world's first autonomous observability agent, built on AI and machine learning
  • Oolie leverages Coralogix's massive observability data platform to:
    • Automatically investigate issues and incidents
    • Correlate disparate data sources to identify root causes
    • Eliminate noise and focus on the most relevant signals
    • Provide actionable insights and recommended fixes
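
The correlation and noise-reduction steps above can be illustrated with a minimal sketch. This is not how Oolie is implemented. The `Signal` type, the `correlate` function, the five-minute window, and the sample data are all hypothetical, chosen only to show the general idea of gathering signals from different sources around an incident and collapsing repeats.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    source: str       # hypothetical: "log", "metric", or "trace"
    service: str
    timestamp: float  # seconds since the epoch
    message: str


def correlate(signals, incident_time, window_s=300.0):
    """Keep signals near the incident and count repeats of each one.

    Returns (key, count) pairs, most frequent first, where key is
    (source, service, message).
    """
    nearby = [s for s in signals if abs(s.timestamp - incident_time) <= window_s]
    counts = Counter((s.source, s.service, s.message) for s in nearby)
    return counts.most_common()


# Made-up sample data for illustration only.
signals = [
    Signal("log", "notifications", 1000.0, "slow query"),
    Signal("log", "notifications", 1010.0, "slow query"),
    Signal("metric", "rds", 1005.0, "high read latency"),
    Signal("log", "billing", 9000.0, "cache miss"),  # outside the window
]

for key, count in correlate(signals, incident_time=1000.0):
    print(count, key)
```

Here the two repeated log lines collapse into one entry with a count of 2, and the unrelated billing signal is dropped because it falls outside the window.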

How Oolie Works

  • Oolie is not a single AI model, but a team of specialized agents powered by small language models (SLMs)
  • These agents collaborate to conduct investigations, analyze data, and generate insights
  • Oolie learns the specifics of the customer's observability data and environment to provide context-aware analysis
  • Its key components are the model, the system prompt, the agent architecture, the knowledge base, and ongoing evaluation and evolution
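
The "team of specialized agents" idea can be sketched as a set of small agents, each with its own system prompt and knowledge, that all answer the same question. Everything here is an illustrative assumption: `SpecialistAgent`, `investigate`, and `stub_model` are hypothetical names, and `stub_model` is a placeholder that stands in for a real small language model call. The talk does not describe Oolie's actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A "model" is any callable mapping a prompt string to a response string.
Model = Callable[[str], str]


def stub_model(prompt: str) -> str:
    """Placeholder for a small language model: echoes the prompt's last line."""
    return "finding for: " + prompt.splitlines()[-1]


@dataclass
class SpecialistAgent:
    name: str
    system_prompt: str
    model: Model
    knowledge: Dict[str, str] = field(default_factory=dict)  # environment facts

    def run(self, question: str) -> str:
        # Combine the system prompt, known facts, and the question into one prompt.
        context = "\n".join(f"{k}: {v}" for k, v in self.knowledge.items())
        prompt = f"{self.system_prompt}\n{context}\n{question}"
        return f"[{self.name}] {self.model(prompt)}"


def investigate(agents: List[SpecialistAgent], question: str) -> List[str]:
    """Ask every specialist the same question and collect their findings."""
    return [agent.run(question) for agent in agents]


agents = [
    SpecialistAgent("logs", "You analyze log data.", stub_model),
    SpecialistAgent("metrics", "You analyze metrics.", stub_model,
                    knowledge={"database": "RDS"}),
]

for finding in investigate(agents, "Why is latency spiking?"):
    print(finding)
```

A real system would also need the evaluation step listed above, for example by scoring each finding against known incidents, which this sketch leaves out.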

Real-World Results

  • Example: Oolie helped a customer whose notification service was experiencing random latency spikes
    • Oolie mapped the service's connections to other services and to the underlying RDS database
    • Oolie found that unindexed tables in the RDS database were causing the latency spikes
    • The fix was straightforward once the root cause was identified
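
To show why a missing index can cause this kind of latency, the sketch below compares query plans before and after adding an index. It uses an in-memory SQLite database as a stand-in, since the talk does not say which RDS engine was involved. The `notifications` table, its columns, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notifications (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO notifications (user_id, body) VALUES (?, ?)",
    [(i % 1000, "msg") for i in range(10000)],
)

query = "SELECT body FROM notifications WHERE user_id = ?"


def plan(connection, sql):
    """Return SQLite's query plan description for the given statement."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
    # Each row is (id, parent, notused, detail); the detail text is the plan.
    return " | ".join(row[3] for row in rows)


before = plan(conn, query)  # full table scan: every row is examined
conn.execute("CREATE INDEX idx_notifications_user_id ON notifications (user_id)")
after = plan(conn, query)   # index search: only matching rows are visited

print("before:", before)
print("after: ", after)
```

The exact plan wording varies between SQLite versions, but the first plan reports a scan of the whole table and the second reports a search using the new index. Other database engines expose the same distinction through their own `EXPLAIN` output.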

The Future of Observability

  • Oolie represents a paradigm shift, moving from linear observability improvements to exponential gains in reliability
  • The next steps are proactive alerting and remediation, followed by the "holy grail" of preventive observability
  • Oolie is now generally available to all Coralogix customers; the proactive capabilities are expected within a few months of the talk

Key Takeaways

  • Software complexity has outpaced traditional observability approaches
  • Autonomous, AI-powered observability can provide 10x improvements in reliability
  • Oolie is a pioneering solution that leverages specialized AI agents to investigate, analyze, and resolve issues
  • Adopting AI-native observability is crucial to keep pace with the evolving software landscape
