Handle millions of observability events with Apache Flink & Prometheus (OPN406)

Key takeaways from the video transcript:

Introduction to Observability

  • Observability gives visibility into a system, which enables real-time troubleshooting and a better customer experience.
  • Observability goes beyond monitoring IT infrastructure and applications; it is about observing the entire business.
  • The three pillars of observability are logs, traces, and metrics. This talk focuses on metrics and time series data.

Understanding Prometheus

  • Prometheus is a multi-dimensional time series database used for real-time visualization, alerting, and integration with various systems.
  • Prometheus is designed for operational metrics and prioritizes high availability and data freshness over strict consistency.
  • Prometheus can be used for a variety of use cases beyond just IT infrastructure monitoring, such as IoT, manufacturing, and telecommunications.
  • Prometheus supports two main ways of ingesting data: pull-based scraping and push-based remote write.

Challenges with Prometheus for Observability

  • When dealing with high-cardinality and high-frequency data (e.g., IoT devices), Prometheus may face performance challenges when querying and storing the data.
  • There may be a need for pre-processing and enrichment of the raw data before writing to Prometheus to reduce cardinality and improve query performance.
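
To make the cardinality concern concrete, the sketch below counts distinct time series for a tiny, made-up vehicle fleet before and after dropping a per-vehicle label. The metric name, the label names (`vehicle_id`, `region`, `model`), and the values are illustrative assumptions, not data from the talk. It relies only on the fact that Prometheus identifies a series by its metric name plus its full label set.

```python
from collections import defaultdict

# Hypothetical raw samples: (metric name, labels, value).
raw_samples = [
    ("engine_temp", {"vehicle_id": "v1", "region": "eu", "model": "a"}, 90.0),
    ("engine_temp", {"vehicle_id": "v2", "region": "eu", "model": "a"}, 94.0),
    ("engine_temp", {"vehicle_id": "v3", "region": "us", "model": "a"}, 88.0),
    ("engine_temp", {"vehicle_id": "v4", "region": "us", "model": "b"}, 101.0),
]


def series_key(name, labels):
    """A series is identified by its metric name plus its full label set."""
    return (name, tuple(sorted(labels.items())))


def cardinality(samples):
    """Number of distinct series present in the samples."""
    return len({series_key(name, labels) for name, labels, _ in samples})


def drop_label_and_average(samples, label):
    """Remove one label, then average values that collapse into the same series."""
    groups = defaultdict(list)
    for name, labels, value in samples:
        reduced = {k: v for k, v in labels.items() if k != label}
        groups[series_key(name, reduced)].append(value)
    return [
        (name, dict(label_items), sum(values) / len(values))
        for (name, label_items), values in groups.items()
    ]


reduced_samples = drop_label_and_average(raw_samples, "vehicle_id")
print(cardinality(raw_samples), cardinality(reduced_samples))
```

With one series per vehicle, cardinality grows linearly with fleet size; after dropping `vehicle_id` it is bounded by the number of region and model combinations. The trade-off is that per-vehicle detail is no longer queryable in Prometheus.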

Introducing Apache Flink

  • Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams.
  • Flink provides a unified API for processing both bounded and unbounded data, making it well-suited for stream processing use cases.
  • Flink has a rich ecosystem of connectors that allow reading from and writing to various systems, including databases, message queues, and file systems.

Combining Flink and Prometheus

  • The built-in Flink Prometheus metrics reporter is not suitable for high-scale observability use cases, because it is designed to monitor the Flink application itself rather than to process external observability data.
  • Implementing a custom Prometheus remote write integration with Flink is possible but requires significant effort to handle batching, error handling, and other complexities.

The Flink Prometheus Connector

  • The Flink Prometheus connector is a new addition to the Flink ecosystem that simplifies the integration between Flink and Prometheus.
  • The connector fully implements the Prometheus remote write specification, optimizing for high-throughput writes and horizontal scalability.
  • The connector handles batching, retrying, and ordering of the data written to Prometheus, making it a suitable solution for high-scale observability use cases.
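
The batching, retrying, and ordering concerns listed above can be sketched in a few lines. This is not the connector's code or API: `BatchingWriter`, `RetryableError`, and the `send` callback are hypothetical names, and the retry policy (retry only on a retryable failure, with exponential backoff) is an assumption modelled on the remote write specification's guidance to retry 5xx responses.

```python
import time


class RetryableError(Exception):
    """Stands in for a retryable (5xx) response from a remote write endpoint."""


class BatchingWriter:
    """Buffers samples, sends them in batches, and retries a failed batch in place."""

    def __init__(self, send, max_batch_size=3, max_retries=3, base_delay_s=0.0):
        self.send = send  # hypothetical callable that delivers one batch
        self.max_batch_size = max_batch_size
        self.max_retries = max_retries
        self.base_delay_s = base_delay_s
        self.buffer = []

    def write(self, sample):
        self.buffer.append(sample)
        if len(self.buffer) >= self.max_batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        batch, self.buffer = self.buffer, []
        for attempt in range(self.max_retries + 1):
            try:
                # The same batch is retried before any later batch is sent,
                # so samples reach the endpoint in their original order.
                self.send(batch)
                return
            except RetryableError:
                if attempt == self.max_retries:
                    raise
                time.sleep(self.base_delay_s * (2 ** attempt))


# Usage: a fake endpoint that fails once, then accepts everything.
received = []
calls = {"n": 0}


def flaky_send(batch):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RetryableError("simulated 503")
    received.append(list(batch))


writer = BatchingWriter(flaky_send, max_batch_size=3)
for ts in range(1, 6):
    writer.write(("engine_temp", ts, 90.0 + ts))
writer.flush()  # send the final partial batch
```

Blocking on the failed batch keeps ordering simple at the cost of throughput; a production sink also has to handle non-retryable errors, timeouts, and parallel writers, which is the effort the connector is meant to save.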

Demo: Connected Vehicles Use Case

  • The demo showcases a use case of processing observability data from a fleet of connected vehicles using Flink and Prometheus.
  • The pre-processor Flink application performs data enrichment, aggregation, and cardinality reduction before writing the processed metrics to Prometheus.
  • Compared with a raw event writer that writes every event directly to Prometheus, the pre-processor approach gives better query performance and cost-efficiency in Prometheus.
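
The pre-processing steps named above (enrichment, aggregation, cardinality reduction) can be illustrated without Flink. The sketch below is plain Python, not the demo's code: the event fields, the `VEHICLE_MODELS` lookup table, and the one-minute window are assumptions chosen for illustration. It enriches each event with a vehicle model, then averages speed per model and per fixed (tumbling) time window, so one output value replaces many per-vehicle events.

```python
from collections import defaultdict

# Hypothetical lookup table used for enrichment: vehicle id -> model.
VEHICLE_MODELS = {"v1": "model_a", "v2": "model_a", "v3": "model_b"}


def preprocess(events, window_ms=60_000):
    """Enrich events with the vehicle model, then average speed per (model, window)."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for event in events:
        model = VEHICLE_MODELS.get(event["vehicle_id"], "unknown")
        # Align the timestamp to the start of its tumbling window.
        window_start = event["ts_ms"] - event["ts_ms"] % window_ms
        acc = sums[(model, window_start)]
        acc[0] += event["speed"]
        acc[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


events = [
    {"vehicle_id": "v1", "ts_ms": 1_000, "speed": 50.0},
    {"vehicle_id": "v2", "ts_ms": 30_000, "speed": 70.0},
    {"vehicle_id": "v3", "ts_ms": 45_000, "speed": 40.0},
    {"vehicle_id": "v1", "ts_ms": 61_000, "speed": 80.0},
]
result = preprocess(events)
for (model, window_start), avg_speed in result.items():
    print(model, window_start, avg_speed)
```

In a streaming engine the same logic would run continuously with keyed state and windows instead of an in-memory dictionary; the point here is only that the output has one series per model rather than one per vehicle.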

Conclusion

  • Combining Flink and Prometheus through the Flink Prometheus connector makes it possible to observe and monitor widely distributed resources at scale, such as IoT devices, vehicles, or other systems.
  • The Flink Prometheus connector allows for efficient pre-processing and enrichment of observability data before writing to Prometheus, improving query performance and cost-effectiveness.
  • The resources provided (documentation, demo code, and managed service links) can help developers get started with this solution.
