Quick Start Observability for AWS Lambda

AWS Lambda

Coralogix Extension For AWS Lambda Includes:

Dashboards - 1

Gain instant visibility into all your AWS Lambda data.

AWS Lambda

Alerts - 5

Stay on top of key AWS Lambda performance metrics. Keep everyone informed through integrations with Slack, PagerDuty, and more.

More than 10 throttling events in 10 minutes

This alert tracks throttling events to surface potential scalability or configuration issues in AWS Lambda. Throttling often indicates that the function is hitting service limits, which can impact performance and availability.

The alert is activated when the Lambda function experiences more than 5 throttling events within a 10-minute period. Throttling events occur when AWS Lambda limits the function’s execution because predefined concurrency or rate limits have been reached, possibly leading to delayed or denied service. Monitoring these events helps in understanding and managing workload distribution and operational limits.

Customization Guidance:

- Threshold: The default threshold is set at 5 throttling events in 10 minutes. Depending on the function’s role and expected load, this threshold may be adjusted to better reflect operational norms and service requirements.
- Monitoring Period: The monitoring period can be made shorter or longer than 10 minutes based on the traffic pattern and criticality of the function. Shorter periods may be used for high-traffic, critical functions to catch issues more rapidly.
- Notification Frequency: Consider the frequency of this alert to optimize the balance between responsiveness and noise. Adjust according to the criticality of the function’s uninterrupted operation.

Action: Should this alert trigger, review the function’s concurrency settings and the code’s execution path. Consider increasing the concurrency limits, optimizing code performance, or employing strategies such as queueing mechanisms to handle peak loads better.
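To make the threshold and monitoring period concrete, the rule described above can be sketched as a sliding-window counter. This is only an illustration, not how Coralogix evaluates the alert: the class name `ThrottleWindow` and its parameters are hypothetical, timestamps are assumed to be in seconds and to arrive in order, and the defaults mirror the 5 events in 10 minutes stated in the description.

```python
from collections import deque


class ThrottleWindow:
    """Hypothetical sliding-window counter for throttling events."""

    def __init__(self, window_seconds=600, threshold=5):
        self.window_seconds = window_seconds  # monitoring period (10 minutes)
        self.threshold = threshold            # fire when the count exceeds this
        self._events = deque()                # event timestamps, oldest first

    def record(self, timestamp):
        """Register one throttling event; return True if the alert would fire."""
        self._events.append(timestamp)
        # Discard events that have fallen out of the monitoring period.
        while self._events and self._events[0] <= timestamp - self.window_seconds:
            self._events.popleft()
        return len(self._events) > self.threshold
```

Lowering `window_seconds` corresponds to the shorter monitoring period suggested for high-traffic, critical functions, while raising `threshold` suits functions where occasional throttling is expected.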

Execution Time Exceeded 5 Seconds in 10 minutes

This alert monitors the average execution time of a Lambda function to identify performance issues caused by longer processing times, which can affect user experience and resource utilization.

The alert is activated when the average execution time of the Lambda function exceeds 5 seconds over the last 10 minutes. Monitoring execution time is crucial for ensuring that the Lambda function operates within its performance and cost-efficiency targets. Extended execution times may indicate issues with the function’s code, external service latencies, or suboptimal resource allocation.

Customization Guidance:

- Threshold: The default threshold for triggering this alert is an average execution time of 5 seconds. Modify this threshold based on the specific performance expectations and criticality of the function within your application.
- Monitoring Period: The standard monitoring period is set at 10 minutes to provide a timely overview of performance trends. This time range should be adjusted based on the expected frequency and variability of function invocations.
- Notification Frequency: Consider the frequency of this alert to optimize the balance between responsiveness and noise. Adjust according to the criticality of the function’s uninterrupted operation.

Action: If this alert triggers, review the detailed execution logs to identify any anomalies or efficiency issues. Consider optimizing the function’s code, increasing allocated resources, or adjusting timeout settings. Further, investigate dependencies on external services that may be contributing to increased latency.
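When reasoning about the 5-second average, it helps to remember that an average over 10 minutes is not the same as the mean of per-minute averages if invocation counts vary. The sketch below is an assumption-laden illustration, not the alert's actual query: it assumes each datapoint is a hypothetical `(average_ms, invocation_count)` pair for one interval, and the function names are invented for this example.

```python
def average_duration_ms(datapoints):
    """Invocation-weighted average duration across intervals.

    datapoints: list of (average_ms, invocation_count) pairs (assumed shape).
    Returns None when there were no invocations at all.
    """
    total_ms = sum(avg * count for avg, count in datapoints)
    total_invocations = sum(count for _, count in datapoints)
    if total_invocations == 0:
        return None
    return total_ms / total_invocations


def execution_time_alert(datapoints, threshold_ms=5000):
    """Return True when the weighted average exceeds the threshold."""
    avg = average_duration_ms(datapoints)
    return avg is not None and avg > threshold_ms
```

For example, one busy interval averaging 2 seconds and one quiet interval averaging 9 seconds would not breach the threshold once weighted by invocation count, even though the plain mean of the two averages is above 5 seconds.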

Error Rate Exceeds 5%

This alert identifies issues in the execution of a Lambda function by closely monitoring its error rate. This helps ensure that the function performs reliably and efficiently, maintaining the overall quality of the application.

The alert is activated when the execution error rate of the Lambda function exceeds 1% of its total invocations over a designated monitoring period. An elevated error rate can signal issues with the Lambda function’s code, dependencies, or interaction with other services, which could compromise application performance and user experience.

Customization Guidance:

- Threshold: The default threshold is set at an error rate of 1%. Depending on the criticality of the function and historical performance data, adjust the threshold to better suit the application’s tolerance for errors.
- Monitoring Period: The period over which errors and invocations are counted should be aligned with the application’s usage patterns. Adjust this period to ensure timely detection of issues without causing undue alert noise.
- Notification Frequency: Consider the frequency of this alert to optimize the balance between responsiveness and noise. Adjust according to the criticality of the function’s uninterrupted operation.

Action: Upon activation, promptly analyze the function’s logs to pinpoint the source of errors. Investigate recent code deployments, changes in configuration, and external service interactions. Based on the findings, implement fixes, test changes, and redeploy if necessary.
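The error rate described above is the ratio of errors to total invocations over the monitoring period. A minimal sketch of that arithmetic follows; the function names are hypothetical, the 1% default is taken from the description, and treating a period with zero invocations as a 0% error rate is an assumption made here to avoid division by zero.

```python
def error_rate_percent(errors, invocations):
    """Errors as a percentage of invocations over one monitoring period."""
    if invocations == 0:
        # Assumption: a period with no invocations counts as 0% errors.
        return 0.0
    return 100.0 * errors / invocations


def error_rate_alert(errors, invocations, threshold_percent=1.0):
    """Return True when the error rate exceeds the threshold."""
    return error_rate_percent(errors, invocations) > threshold_percent
```

Because the rule is a ratio, a low-traffic function can cross the threshold with a single failure (1 error in 50 invocations is 2%), which is one reason to align the monitoring period with the application's usage patterns.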

Asynchronous Event Drop Detected

This alert monitors excessive dropping of asynchronous events in AWS Lambda functions, which can indicate processing issues or insufficient resources.

The alert triggers when the number of dropped asynchronous events exceeds 10 within a 10-minute period. Tracking dropped events is crucial for ensuring that all asynchronous tasks are processed as expected. Excessive drops may result from configuration issues, resource limits, or unexpected surges in event volume.

Troubleshooting Guidance:

- Errors Metric: Check the Errors metric to identify any recent increases in function errors, which could be contributing to processing delays.
- Throttles Metric: Review the Throttles metric to determine if concurrency limits are being reached, leading to throttling events that delay event processing.

Customization Guidance:

- Threshold: The default threshold is set at 10 dropped events in 10 minutes. Adjust this threshold based on the usual event traffic and tolerance for dropped events in your specific application.
- Monitoring Period: The standard monitoring period is 10 minutes, which may be shortened or lengthened based on the application’s operational needs and event processing characteristics.
- Notification Frequency: Consider the frequency of this alert to optimize the balance between responsiveness and noise. Adjust according to the criticality of the function’s uninterrupted operation.

Action: Upon activation, examine the Lambda function’s execution logs to identify the root causes of the dropped events. Review and adjust function configurations, such as timeout settings, memory allocation, and concurrency limits, to prevent future drops.

AWS Lambda Asynchronous Event Age Exceeded 10 Minutes

This alert helps ensure that asynchronous events are processed in a timely manner by monitoring the age of these events in the Lambda function’s queue.

The alert is activated when the median age of the oldest asynchronous event in the Lambda function’s queue exceeds 10 minutes. Monitoring the age of asynchronous events helps in identifying delays or backlogs in event processing. A high median age can indicate issues with event handling or resource allocation.

Troubleshooting Guidance:

- Errors Metric: Check the Errors metric to identify any recent increases in function errors, which could be contributing to processing delays.
- Throttles Metric: Review the Throttles metric to determine if concurrency limits are being reached, leading to throttling events that delay event processing.

Customization Guidance:

- Threshold Adjustment: The default threshold is set at 10 minutes. Adjust this based on the criticality of event processing times and the expected throughput of your application.
- Monitoring Period: Consider the typical processing time and event volume to determine if a shorter or longer monitoring period is appropriate.
- Notification Frequency: Consider the frequency of this alert to optimize the balance between responsiveness and noise. Adjust according to the criticality of the function’s uninterrupted operation.

Action: Should this alert trigger, investigate the underlying causes of increased queue times. Possible actions include optimizing code for faster execution, adjusting concurrency settings, or increasing resource allocation.
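One way to read the median-based condition is sketched below. It is an interpretation rather than the alert's actual definition: it assumes a hypothetical list of samples, each being the age in seconds of the oldest queued event at one sampling point, and the function name is invented for this example.

```python
from statistics import median


def event_age_alert(oldest_age_samples_s, threshold_s=600):
    """Return True when the median sampled age exceeds the threshold.

    oldest_age_samples_s: ages (seconds) of the oldest queued event,
    one value per sampling point in the monitoring period (assumed shape).
    """
    if not oldest_age_samples_s:
        return False  # no samples, nothing to evaluate
    return median(oldest_age_samples_s) > threshold_s
```

Using the median means a single stale event does not fire the alert on its own; a sustained backlog, where most samples are above 10 minutes, does.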

Integration

Learn more about Coralogix's out-of-the-box integration with AWS Lambda in our documentation.
