AWS re:Invent 2024 - Improve throughput and monitoring of serverless streaming workloads (SVS217-NEW)
Improving Throughput and Monitoring of Serverless Streaming Workloads
Introduction
Anton, a principal solutions architect for serverless at AWS, is presenting on how to improve throughput and monitoring of serverless streaming workloads.
The session covers topics such as Lambda concurrency, event source mappings, common techniques to optimize throughput, and specific techniques for Kinesis and Apache Kafka.
Attendees are welcome to take pictures of the slides, and a QR code provided at the end links to all the content covered in the session.
Understanding Lambda Concurrency
Lambda functions are executed in execution environments, which can be reused for subsequent invocations.
Lambda concurrency is the number of execution environments serving invocations at the same time.
Each function can scale up to 1,000 execution environments immediately and add a further 1,000 every 10 seconds, up to the account's concurrency limit.
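As a rough illustration of these numbers, the sketch below estimates how much concurrency a workload needs and how long the scaling ramp would take. It assumes the common approximation that concurrency equals requests per second multiplied by average duration, and it reads the scaling rate as 1,000 environments available immediately plus 1,000 more every 10 seconds. The function names are made up for this example.

```python
import math


def required_concurrency(requests_per_second: float, avg_duration_seconds: float) -> int:
    """Approximate number of concurrent execution environments at steady state."""
    return math.ceil(requests_per_second * avg_duration_seconds)


def seconds_to_reach(target_concurrency: int, step: int = 1000, interval_seconds: int = 10) -> int:
    """Ramp-up time, assuming `step` environments are available immediately
    and another `step` is added every `interval_seconds`."""
    if target_concurrency <= step:
        return 0
    extra_steps = math.ceil((target_concurrency - step) / step)
    return extra_steps * interval_seconds


if __name__ == "__main__":
    needed = required_concurrency(requests_per_second=5000, avg_duration_seconds=0.5)
    print(needed, seconds_to_reach(needed))
```

Under these assumptions, 5,000 requests per second at 500 ms each needs about 2,500 environments, which takes two scaling intervals beyond the initial 1,000.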
Lambda emits various metrics to CloudWatch, which can be used to monitor the performance of your functions.
Event Source Mappings (ESMs)
ESMs are the components in Lambda that poll data from event sources like Kafka, Kinesis, and SQS, and invoke the Lambda function.
The ESM process involves four steps: polling, filtering, batching, and processing (invoking the Lambda function).
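The four steps can be pictured as a small pipeline. The sketch below is a simplified in-memory model, not how Lambda implements an ESM: every name in it is hypothetical, and it batches by record count only, whereas a real ESM can also close a batch on a time window or payload size.

```python
from typing import Callable, Iterable, Iterator, List

Record = dict


def poll(source: Iterable[Record]) -> Iterator[Record]:
    """Step 1: read records from the event source."""
    yield from source


def filter_records(records: Iterable[Record], keep: Callable[[Record], bool]) -> Iterator[Record]:
    """Step 2: drop records that do not match the filter."""
    return (r for r in records if keep(r))


def batch(records: Iterable[Record], batch_size: int) -> Iterator[List[Record]]:
    """Step 3: group records into batches of at most batch_size."""
    current: List[Record] = []
    for record in records:
        current.append(record)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current


def run_pipeline(source, keep, batch_size, handler) -> list:
    """Step 4: invoke the handler once per batch and collect the results."""
    return [handler(b) for b in batch(filter_records(poll(source), keep), batch_size)]


if __name__ == "__main__":
    events = [{"id": i, "type": "order" if i % 2 == 0 else "heartbeat"} for i in range(7)]
    results = run_pipeline(
        events,
        keep=lambda r: r["type"] == "order",
        batch_size=3,
        handler=lambda b: [r["id"] for r in b],
    )
    print(results)  # [[0, 2, 4], [6]]
```

Seven polled records become four after filtering and result in only two handler invocations, which is the effect filtering and batching are meant to have.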
Common Techniques to Optimize Throughput
Parallelize Data Processing:
Utilize Lambda concurrency to increase parallel processing.
Parallelize data processing within your Lambda function code.
Reduce Processing Duration:
Allocate more memory to your Lambda function, since Lambda allocates CPU in proportion to the configured memory.
Optimize the code to decrease the duration of your function.
Filter Irrelevant Messages:
Filter out messages that don't need to be processed in the ESM configuration, not in your code.
Batch Messages:
Batch messages before processing them instead of processing them one by one.
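A minimal sketch of how several of these techniques might combine, assuming a Kinesis-style event whose records carry base64-encoded JSON. `ESM_SETTINGS` uses parameter names accepted by `create_event_source_mapping`, but the values, the `type`/`id` payload fields, and the helper names are illustrative assumptions. The handler receives a whole batch and spreads the per-record work across a thread pool.

```python
import base64
import json
from concurrent.futures import ThreadPoolExecutor

# Illustrative ESM settings: batch up to 100 records or 5 seconds, and let the
# ESM drop everything except payloads whose "type" field is "order".
ESM_SETTINGS = {
    "BatchSize": 100,
    "MaximumBatchingWindowInSeconds": 5,
    "FilterCriteria": {
        "Filters": [{"Pattern": json.dumps({"data": {"type": ["order"]}})}]
    },
}


def process_record(record: dict) -> dict:
    """Decode one record; stands in for the real per-record work."""
    payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
    return {"id": payload["id"], "processed": True}


def handler(event: dict, context=None) -> list:
    """Process all records of the batch concurrently inside one invocation."""
    records = event.get("Records", [])
    if not records:
        return []
    with ThreadPoolExecutor(max_workers=min(8, len(records))) as pool:
        # pool.map returns results in the same order as the input records.
        return list(pool.map(process_record, records))


def make_event(ids) -> dict:
    """Build a sample event shaped like a Kinesis batch (for local testing)."""
    return {
        "Records": [
            {"kinesis": {"data": base64.b64encode(
                json.dumps({"id": i, "type": "order"}).encode()).decode()}}
            for i in ids
        ]
    }


if __name__ == "__main__":
    print(handler(make_event([1, 2, 3])))
```

Threads mainly help when the per-record work is I/O-bound, such as calls to a database or another service. CPU-bound work benefits more from the extra CPU that comes with a larger memory setting.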
Kinesis and Apache Kafka Specifics
Kinesis
Kinesis streams are divided into shards, which define the throughput capacity.
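Kinesis Data Streams supports two capacity modes: on-demand, where shard capacity is managed automatically, and provisioned, where you manage the number of shards yourself.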
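Kinesis consumers can operate in two modes: shared throughput, where all consumers share each shard's read capacity, and enhanced fan-out, where each registered consumer gets dedicated read throughput per shard.
Adding shards, or raising the ESM parallelization factor so that several batches from one shard are processed concurrently, can help improve throughput.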
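To make the effect of shards and parallelization factor concrete, the sketch below computes the upper bound on concurrent invocations for a Kinesis ESM and builds the matching parameters for `create_event_source_mapping`. `ParallelizationFactor` accepts values from 1 to 10. The helper names, ARN, and function name are hypothetical.

```python
def max_concurrent_invocations(shard_count: int, parallelization_factor: int = 1) -> int:
    """Upper bound on concurrent Lambda invocations for one Kinesis ESM."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    if not 1 <= parallelization_factor <= 10:
        raise ValueError("parallelization_factor must be between 1 and 10")
    return shard_count * parallelization_factor


def kinesis_esm_params(stream_arn: str, function_name: str,
                       parallelization_factor: int, batch_size: int = 100) -> dict:
    """Keyword arguments that could be passed to create_event_source_mapping."""
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        "BatchSize": batch_size,
        "ParallelizationFactor": parallelization_factor,
    }


if __name__ == "__main__":
    # Hypothetical stream and function names, for illustration only.
    params = kinesis_esm_params(
        "arn:aws:kinesis:us-east-1:123456789012:stream/demo-stream",
        "demo-consumer",
        parallelization_factor=5,
    )
    print(max_concurrent_invocations(shard_count=4, parallelization_factor=5))  # 20
    print(params["ParallelizationFactor"])
```

With 4 shards, raising the factor from 1 to 5 lifts the ceiling from 4 to 20 concurrent invocations without resharding the stream.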
Apache Kafka
Kafka has a concept of partitions, similar to Kinesis shards.
Producers use record keys to distribute records across partitions, so records with the same key land in the same partition.
The ESM polls the Kafka topic with its own consumers, and concurrency scales up to at most one consumer per partition.
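The relationship between keys, partitions, and consumer concurrency can be sketched as follows. This is an illustration only: CRC32 is used as a stand-in hash (Kafka's default partitioner uses murmur2), and the function names are hypothetical.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition. CRC32 is a stand-in hash for this
    sketch; Kafka's default partitioner uses murmur2."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def distribute(keys, num_partitions: int) -> dict:
    """Group keys by the partition they would be written to."""
    partitions = defaultdict(list)
    for key in keys:
        partitions[partition_for(key, num_partitions)].append(key)
    return dict(partitions)


def max_consumers(num_partitions: int) -> int:
    """Upper bound on concurrent consumers: one per partition."""
    return num_partitions


if __name__ == "__main__":
    keys = ["customer-1", "customer-2", "customer-1", "customer-3"]
    print(distribute(keys, num_partitions=3))
    print(max_consumers(3))
```

Because the same key always hashes to the same partition, per-key ordering is preserved, while the partition count sets the ceiling on parallel consumption.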
New Capabilities
Provisioned Mode for Kafka ESM
Lets you configure a minimum and maximum number of event pollers. The minimum stays provisioned at all times and the maximum caps scale-out, which gives faster and more predictable performance for spiky workloads.
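A hedged sketch of what such a configuration might look like. It builds keyword arguments for `create_event_source_mapping` using the `ProvisionedPollerConfig` parameter with its `MinimumPollers` and `MaximumPollers` fields. The helper name, ARN, topic, and poller counts are assumptions chosen for illustration, and the sketch only builds the dictionary without calling AWS.

```python
def kafka_provisioned_esm_params(event_source_arn: str, function_name: str, topic: str,
                                 minimum_pollers: int, maximum_pollers: int) -> dict:
    """Keyword arguments for a Kafka ESM that uses provisioned pollers."""
    if minimum_pollers < 1:
        raise ValueError("minimum_pollers must be at least 1")
    if maximum_pollers < minimum_pollers:
        raise ValueError("maximum_pollers must not be below minimum_pollers")
    return {
        "EventSourceArn": event_source_arn,
        "FunctionName": function_name,
        "Topics": [topic],
        "StartingPosition": "LATEST",
        "ProvisionedPollerConfig": {
            "MinimumPollers": minimum_pollers,  # kept provisioned at all times
            "MaximumPollers": maximum_pollers,  # ceiling for scale-out
        },
    }


if __name__ == "__main__":
    # Hypothetical cluster ARN, function name, and topic.
    params = kafka_provisioned_esm_params(
        "arn:aws:kafka:us-east-1:123456789012:cluster/demo-cluster/abc",
        "demo-consumer",
        "orders",
        minimum_pollers=5,
        maximum_pollers=50,
    )
    print(params["ProvisionedPollerConfig"])
    # With boto3 installed and credentials configured, this could be applied with:
    # boto3.client("lambda").create_event_source_mapping(**params)
```

A higher minimum trades cost for readiness: the pollers are already running when a spike arrives, so the ESM does not have to scale up from a cold start.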
Out-of-the-box ESM Metrics
Provide insight into the state of ingested messages through metrics such as PolledEventCount, InvokedEventCount, and FailedInvokeEventCount.
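One way such counts might be turned into quick health indicators is sketched below. The function and the ratios are assumptions made for illustration, not an AWS-defined calculation. The inputs are expected to be sums of the polled, invoked, and failed-invoke counts over the same time window.

```python
def summarize_esm_counts(polled: int, invoked: int, failed_invoke: int) -> dict:
    """Derive rough health ratios from ESM event counts for one time window."""
    if min(polled, invoked, failed_invoke) < 0:
        raise ValueError("counts must be non-negative")
    return {
        # Share of polled events that reached the function; the remainder was
        # filtered out, dropped, or not yet processed in this window.
        "invoked_ratio": invoked / polled if polled else 0.0,
        # Share of invoked events whose invocation failed.
        "failure_ratio": failed_invoke / invoked if invoked else 0.0,
    }


if __name__ == "__main__":
    print(summarize_esm_counts(polled=1000, invoked=800, failed_invoke=40))
```

A falling invoked ratio with no change in filtering, or a rising failure ratio, would both be signals worth alarming on. Retries can make the raw counts overlap, so the ratios are best read as trends rather than exact values.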
Wrap-up and Next Steps
The session closes with a recap of the four key techniques for improving throughput:
Parallelize data processing
Reduce processing duration
Filter irrelevant messages
Batch messages
It also gives recommendations for Kinesis-specific and Kafka-specific optimizations.
Attendees are encouraged to explore the new capabilities and the resources provided.
Attendees are invited to further sessions and workshops to continue their AWS serverless learning.