Accelerate value from data: Migrating from batch to stream processing (ANT324)
Evolution of Data Processing
Humans and businesses have been collecting data for thousands of years, starting with clay tablets in ancient Mesopotamia.
Data production and processing have exploded in the 21st century due to the ubiquity of the internet, smartphones, and e-commerce.
Data is now being produced continuously, from diverse sources, and is being analyzed by many applications within businesses.
Modern Business Needs
Businesses still require reporting, but also need the ability to generate faster insights and power AI/ML capabilities.
Faster insights lead to better and quicker decision-making, while AI/ML capabilities can differentiate a business from its competitors.
Batch vs. Streaming Processing
Batch processing is useful for powering business reporting and BI tools, but is insufficient for generating faster insights and powering AI/ML.
Stream processing delivers results in real time, enabling faster insights and better support for AI/ML use cases.
Streaming Architecture
Producing and Storing Data:
Use CDC (Change Data Capture) tools such as AWS DMS or Debezium to stream database changes into streaming storage such as Amazon Kinesis or Apache Kafka.
Because CDC reads the database's change log rather than repeatedly querying tables, this reduces load on the source database and provides low-latency ingestion.
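As a concrete illustration of what lands in the stream, here is a minimal Python sketch of a Debezium-style change event and how a downstream consumer might flatten it into a stream record. The envelope field names (`before`, `after`, `op`, `ts_ms`) follow Debezium's documented format; the table columns and values are invented for this example.

```python
import json

# A simplified Debezium-style change event. "before"/"after" are the row
# images, "op" is the operation code (c = create, u = update, d = delete),
# and "ts_ms" is the change timestamp. The order fields are hypothetical.
change_event = json.dumps({
    "before": {"order_id": 1001, "status": "PENDING"},
    "after":  {"order_id": 1001, "status": "SHIPPED"},
    "op": "u",
    "ts_ms": 1700000000000,
})

def to_stream_record(raw: str) -> dict:
    """Flatten a change event into a record for downstream processing."""
    event = json.loads(raw)
    # For deletes the "after" image is null, so fall back to "before".
    row = event["after"] if event["op"] != "d" else event["before"]
    return {"op": event["op"], "ts_ms": event["ts_ms"], **row}

record = to_stream_record(change_event)
print(record)
```

Each committed database change becomes one such record on the stream, so consumers see the updated row image within moments of the transaction rather than waiting for the next batch extract.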
Processing Data in Motion:
Use a managed service for Apache Flink (such as Amazon Managed Service for Apache Flink) to continuously process the streaming data, performing filtering, enrichment, and aggregation as events arrive.