Observability and the Full Picture of Application Monitoring
Lessons from a 1986 Commercial
- A 1986 commercial that won a Golden Lion advertising award showed a young man who at first appears to be a criminal but is actually saving an elderly man's life.
- Its message: a partial perspective can lead to the wrong conclusion, and only the full picture shows what is really happening.
Building a Full Picture of Application Observability at AWS
- At AWS, builders take pride in operational excellence, which contributes directly to the company's business results.
- To achieve this, AWS engineers have been building a "full picture" of application observability, allowing them to make the right decisions about their services.
Cloud-First Observability
- The observability built by AWS must be:
  - Resilient, remaining available even when other services are down.
  - Efficient across a wide range of workload sizes.
Application Signals in Amazon CloudWatch
- Last year, AWS launched a "full picture" tool in CloudWatch called Application Signals.
- This tool embodies operational practices to help engineers be efficient and productive.
Understanding the Full Picture
Application Signals provides three entry points to understand the full picture:
- Service Level Objectives (SLOs): Let you define the things that matter to your users, such as API latency or the ability to create objects.
- Service Dashboard: Provides a full overview across all your services, including information about dependencies.
- Service Map: Allows you to see the full topology of your application and any health issues.
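To make the SLO entry point concrete, here is a minimal sketch of how a latency SLO might be evaluated. It does not use the Application Signals API; the `LatencySlo` class, the function names, the 300 ms threshold, the 80% goal, and the sample latencies are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LatencySlo:
    """Hypothetical SLO: a share of requests must finish within a threshold."""
    name: str
    threshold_ms: float  # a request is "good" if it completes within this latency
    goal: float          # target fraction of good requests, e.g. 0.8


def attainment(slo, latencies_ms):
    """Fraction of requests that met the latency threshold."""
    if not latencies_ms:
        return 1.0
    good = sum(1 for v in latencies_ms if v <= slo.threshold_ms)
    return good / len(latencies_ms)


def error_budget_remaining(slo, latencies_ms):
    """Share of the error budget still unspent (negative once the SLO is breached)."""
    allowed_bad = (1.0 - slo.goal) * len(latencies_ms)
    actual_bad = sum(1 for v in latencies_ms if v > slo.threshold_ms)
    if allowed_bad == 0:
        return 1.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad


# Illustrative data only: ten requests, one of them slower than the threshold.
slo = LatencySlo(name="GetAppointment latency", threshold_ms=300, goal=0.8)
samples = [120, 180, 250, 90, 310, 140, 200, 220, 160, 130]

print("attainment:", attainment(slo, samples))
print("meets goal:", attainment(slo, samples) >= slo.goal)
print("error budget remaining:", error_budget_remaining(slo, samples))
```

The error budget framing is a common way to read an SLO: the goal leaves room for a fixed number of bad requests, and the remaining share shows how close the service is to a breach.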
Diagnosing Anomalies
Once an anomaly is spotted, you can use Application Signals to:
- Identify the anomaly, such as a spike in P99 metrics.
- Get the precise transactions causing the anomaly.
- Determine the root cause of the high-latency transactions.
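The three diagnosis steps above can be sketched in plain Python. This is an illustration of the reasoning rather than how Application Signals is implemented; the transaction fields (`id`, `latency_ms`, `dependency`), the window labels, and the "twice the median" spike rule are assumptions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def find_p99_spikes(windows, factor=2.0):
    """Labels of time windows whose P99 exceeds `factor` times the median window P99."""
    p99s = {
        label: percentile([t["latency_ms"] for t in txns], 99)
        for label, txns in windows.items()
    }
    baseline = percentile(list(p99s.values()), 50)
    return [label for label, value in p99s.items() if value > factor * baseline]


def slowest_transactions(txns, pct=99):
    """Transactions at or above the window's percentile latency, slowest first."""
    cutoff = percentile([t["latency_ms"] for t in txns], pct)
    slow = [t for t in txns if t["latency_ms"] >= cutoff]
    return sorted(slow, key=lambda t: t["latency_ms"], reverse=True)


# Hypothetical per-minute transaction samples.
windows = {
    "12:00": [
        {"id": "t1", "latency_ms": 110, "dependency": "db"},
        {"id": "t2", "latency_ms": 130, "dependency": "db"},
        {"id": "t3", "latency_ms": 120, "dependency": "cache"},
    ],
    "12:01": [
        {"id": "t4", "latency_ms": 115, "dependency": "db"},
        {"id": "t5", "latency_ms": 125, "dependency": "cache"},
        {"id": "t6", "latency_ms": 140, "dependency": "db"},
    ],
    "12:02": [
        {"id": "t7", "latency_ms": 120, "dependency": "db"},
        {"id": "t8", "latency_ms": 2400, "dependency": "payments-api"},
        {"id": "t9", "latency_ms": 135, "dependency": "cache"},
    ],
}

# Step 1: spot the anomaly. Step 2: get the transactions behind it.
# Step 3: look at what those transactions called as a hint toward the root cause.
for label in find_p99_spikes(windows):
    for txn in slowest_transactions(windows[label]):
        print(label, txn["id"], txn["latency_ms"], txn["dependency"])
```

In practice the last step would follow a trace into the slow dependency; here the `dependency` field stands in for that trace data.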
Incorporating Additional Perspectives
Application Signals allows you to incorporate other perspectives, such as:
- Synthetic, outside-in monitoring to model customer workflows.
- Real user monitoring to understand issues experienced by end-users.
- Resource-level information (e.g., container or Lambda metrics) to identify resource-level stresses.
New Capabilities in Application Signals
- Transaction Analytics: Captures 100% of application transactions for analysis and to identify unique anomalies.
- Integration with Amazon DevOps Guru: Quickly identifies the root cause of service-level objective breaches.
- Expanded support for open-source observability with OpenTelemetry.
- Enhancements for volume-based SLOs, runtime metrics, language coverage, database support, and generative AI analytics.
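The summary does not define "volume-based SLOs". Assuming the term means attainment measured over individual requests rather than over time periods, the sketch below contrasts the two calculations. The function names, the 99% per-period threshold, and the traffic numbers are hypothetical.

```python
def period_based_attainment(periods, threshold):
    """Share of periods whose own success ratio met the threshold.

    Each period is a (good_requests, total_requests) pair; an empty
    period is treated as good.
    """
    good_periods = sum(
        1 for good, total in periods if total == 0 or good / total >= threshold
    )
    return good_periods / len(periods)


def request_based_attainment(periods):
    """Share of all requests that were good, regardless of which period they fell in."""
    good = sum(g for g, _ in periods)
    total = sum(t for _, t in periods)
    return good / total if total else 1.0


# Hypothetical traffic: the third period is quiet (10 requests) but half of them fail.
periods = [(1000, 1000), (995, 1000), (5, 10), (2000, 2000)]

print("period-based: ", period_based_attainment(periods, threshold=0.99))
print("request-based:", request_based_attainment(periods))
```

With these numbers the quiet, unhealthy period counts as a full quarter of the period-based result, while the request-based result weights it by its small share of traffic. Which view is appropriate depends on whether brief low-traffic failures matter to users.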
Demonstration of Application Signals Features
The presentation includes a live demonstration showcasing various use cases and capabilities of Application Signals, such as:
- Troubleshooting issues in a Lambda-based appointment service.
- Correlating application-level and database-level performance.
- Monitoring the usage of generative AI models in an application.
- Providing efficient customer support by quickly identifying the root cause of issues.
- Automating the root cause analysis using Amazon DevOps Guru.
Testimonial from PBS
- Brian Link, the Director of Technical Operations at PBS, shared how Application Signals and CloudWatch have helped PBS improve their application observability and business metrics.
- Examples include monitoring the performance of their Station Video Portal, tracking localization issues, and gaining visibility into their recommendation engine and donation tracking.