Quick Start Observability for Amazon EventBridge

Amazon EventBridge

Coralogix Extension For Amazon EventBridge Includes:

Dashboards - 1

Get instant visualization of all your Amazon EventBridge data.

Alerts - 5

Stay on top of key Amazon EventBridge performance metrics. Keep everyone informed through integrations with Slack, PagerDuty, and more.

Amazon EventBridge - High Failed Invocation

This alert monitors the number of failed invocations in Amazon EventBridge. High failed invocations can disrupt event-driven workflows and lead to delays or loss of critical functionality in dependent systems. The alert is triggered when the number of failed invocations exceeds 1000 within a 10-minute period. Monitoring this metric helps ensure the reliability of event-driven architectures by identifying issues with event delivery or target configuration.

Customization Guidance:
- Threshold: Adjust the threshold based on your application's tolerance for failed invocations and the expected volume of events.
- Monitoring Period: Modify the monitoring period to capture patterns in invocation failures during peak or critical times.
- Notification Frequency: Set a suitable frequency to ensure timely awareness while avoiding alert fatigue.

Action: If this alert is triggered, consider investigating the root cause of the failures, such as misconfigured event targets, network issues, or resource constraints. Additionally, review the EventBridge retry policies and error handling mechanisms.
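The trigger condition described above (more than 1000 failed invocations within 10 minutes) can be sketched in a few lines of Python. This is only an illustration of the windowed-threshold logic, not how Coralogix evaluates the alert; the function names and the `(timestamp, value)` sample format are assumptions made for this example.

```python
from datetime import datetime, timedelta

def count_in_window(samples, now, window=timedelta(minutes=10)):
    """Sum metric values whose timestamp falls inside (now - window, now]."""
    start = now - window
    return sum(value for ts, value in samples if start < ts <= now)

def failed_invocations_alert(samples, now, threshold=1000):
    """True when failed invocations in the last 10 minutes exceed the threshold."""
    return count_in_window(samples, now) > threshold

# Hypothetical data: 150 failed invocations reported each minute for 12 minutes.
now = datetime(2024, 1, 1, 12, 0)
samples = [(now - timedelta(minutes=m), 150) for m in range(12)]

print(count_in_window(samples, now))           # only the last 10 minutes are counted
print(failed_invocations_alert(samples, now))  # compared against the 1000 default
```

Raising `threshold` or widening `window` in this sketch corresponds to the Threshold and Monitoring Period adjustments listed in the customization guidance.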

High Put Events Approximate Failed Call Count

This alert monitors the approximate number of failed calls to Amazon EventBridge, indicating issues in delivering events to configured targets. High failed call counts can cause delays or failures in event-driven workflows, potentially affecting downstream systems. The alert is triggered when the approximate failed call count exceeds 50 within a 10-minute period. Monitoring this metric helps ensure the reliability and integrity of event delivery in your architecture by detecting delivery failures early.

Customization Guidance:
- Threshold: Adjust the threshold based on the typical volume of events and your system's tolerance for failed calls.
- Monitoring Period: Tune the monitoring period to align with the frequency and criticality of event traffic.
- Notification Frequency: Configure notification frequency to ensure prompt responses while minimizing unnecessary alerts.

Action: If this alert is triggered, review EventBridge logs to identify the cause of the failures, such as incorrect target configurations or insufficient permissions. Ensure retries are configured correctly and evaluate the health of target systems.
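On the producer side, one way to act on failed calls is to inspect the PutEvents response and collect the rejected entries for retry. The sketch below assumes the response shape documented for the PutEvents API (a `FailedEntryCount` field plus an `Entries` list in request order, where failed items carry an `ErrorCode`). The helper name and all sample values are hypothetical, and no AWS call is made here.

```python
def failed_entries(request_entries, response):
    """Pair each rejected request entry with the error code EventBridge returned.

    Relies on the response 'Entries' list being in the same order as the request.
    """
    failed = []
    for entry, result in zip(request_entries, response.get("Entries", [])):
        if "ErrorCode" in result:
            failed.append((entry, result["ErrorCode"]))
    return failed

# Hypothetical request entries and a hand-written response for illustration.
entries = [
    {"Source": "app.orders", "DetailType": "OrderPlaced", "Detail": "{}"},
    {"Source": "app.orders", "DetailType": "OrderCancelled", "Detail": "{}"},
]
response = {
    "FailedEntryCount": 1,
    "Entries": [
        {"EventId": "example-event-id"},
        {"ErrorCode": "InternalFailure", "ErrorMessage": "example message"},
    ],
}

to_retry = failed_entries(entries, response)
for entry, code in to_retry:
    print(entry["DetailType"], code)
```

Entries collected this way could be resubmitted with backoff; whether that is appropriate depends on the error code and on your own retry policy.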

High Put Event Latency

This alert monitors the latency of **PutEvents** API calls in Amazon EventBridge. High latency can lead to delays in event processing, which may impact the performance of downstream event-driven workflows. The alert is triggered when the latency for PutEvents API calls exceeds 500 milliseconds for more than 10% of requests within a 10-minute period. Monitoring this metric helps ensure timely event ingestion and processing, which is critical for maintaining the responsiveness of event-driven systems.

Customization Guidance:
- Threshold: Adjust the threshold based on acceptable latency levels for your application's event ingestion requirements.
- Monitoring Period: Modify the monitoring period to align with expected traffic patterns and application performance needs.
- Notification Frequency: Set notification frequency to promptly detect performance degradation while avoiding excessive alerts during transient spikes.

Action: If this alert is triggered, investigate possible causes such as network issues, high API request volumes, or resource constraints in EventBridge. Review the scalability of your event producers and consider optimizing request batching to reduce latency.
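The latency condition (more than 10% of requests slower than 500 milliseconds) and the request-batching suggestion can both be illustrated with a short Python sketch. The function names and sample latencies are assumptions for this example. The batch size of 10 reflects the per-request entry limit that AWS documents for PutEvents; verify it against the current API reference before relying on it.

```python
def slow_fraction(latencies_ms, limit_ms=500):
    """Fraction of requests whose latency exceeds limit_ms (0.0 for no data)."""
    if not latencies_ms:
        return 0.0
    return sum(1 for latency in latencies_ms if latency > limit_ms) / len(latencies_ms)

def latency_alert(latencies_ms, limit_ms=500, max_slow_fraction=0.10):
    """True when more than max_slow_fraction of requests exceed limit_ms."""
    return slow_fraction(latencies_ms, limit_ms) > max_slow_fraction

def batched(entries, size=10):
    """Yield consecutive slices of at most `size` entries, one per PutEvents call."""
    for i in range(0, len(entries), size):
        yield entries[i:i + size]

# Hypothetical latencies observed in one 10-minute window: 3 of 20 are slow.
latencies = [120] * 17 + [800] * 3
print(latency_alert(latencies))

# 23 hypothetical events grouped into as few requests as possible.
batch_sizes = [len(batch) for batch in batched(list(range(23)))]
print(batch_sizes)
```

Sending events in full batches reduces the number of API calls a producer makes, which is one of the levers the Action text points to when request volume is the cause of high latency.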

Low Triggered Rule Count

This alert monitors the number of triggered rules in Amazon EventBridge. A low triggered rule count may indicate reduced event activity or potential issues with rule configurations or event sources. The alert is triggered when the count of triggered rules falls below 5 within a 10-minute period. Monitoring this metric helps ensure the proper functioning of your event-driven system by identifying anomalies in rule triggering patterns.

Customization Guidance:
- Threshold: Adjust the threshold based on the expected baseline activity of triggered rules in your EventBridge setup.
- Monitoring Period: Modify the monitoring period to align with typical activity windows for event generation and rule execution.
- Notification Frequency: Set notification frequency to promptly detect issues while avoiding unnecessary alerts during predictable low-activity periods.

Action: If this alert is triggered, check for issues such as misconfigured rules, inactive event sources, or upstream failures in event generation. Verify that expected events are being published to the event bus and that rules are set to match the intended patterns.
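When checking whether a rule matches the intended events, it can help to reason about the pattern locally. The sketch below is a deliberately simplified matcher that handles only exact-value lists and nested objects. It is not the EventBridge matching engine, which also supports operators such as prefix and numeric matching; the function name, pattern, and events are hypothetical.

```python
def matches(pattern, event):
    """Simplified event-pattern check: exact-value lists and nested objects only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: the event field must be an object that matches too.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        else:
            # Leaf: the pattern lists allowed values; an array-valued event
            # field matches when any of its elements is allowed.
            candidates = actual if isinstance(actual, list) else [actual]
            if not any(candidate in expected for candidate in candidates):
                return False
    return True

# Hypothetical rule pattern and events.
pattern = {"source": ["app.orders"], "detail": {"status": ["FAILED", "CANCELLED"]}}
failed_order = {
    "source": "app.orders",
    "detail-type": "OrderUpdate",
    "detail": {"status": "FAILED", "orderId": "123"},
}
shipped_order = {
    "source": "app.orders",
    "detail-type": "OrderUpdate",
    "detail": {"status": "SHIPPED", "orderId": "124"},
}

print(matches(pattern, failed_order))
print(matches(pattern, shipped_order))
```

For an authoritative answer, the EventBridge TestEventPattern API (exposed in boto3 as `test_event_pattern`) evaluates a pattern against a sample event using the service's own matching rules.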

Low Invocation Count

This alert monitors the number of invocations in Amazon EventBridge. A low invocation count may indicate reduced event activity, inactive rules, or potential issues with event sources or target configurations. The alert is triggered when the invocation count falls below 10 within a 10-minute period. Monitoring this metric helps ensure the continuity of event-driven workflows by identifying anomalies in event delivery and processing patterns.

Customization Guidance:
- Threshold: Adjust the threshold based on the expected baseline activity for event invocations in your EventBridge environment.
- Monitoring Period: Modify the monitoring period to align with typical activity windows, especially during peak or critical periods.
- Notification Frequency: Configure notification frequency to detect issues promptly while avoiding unnecessary alerts during expected low-activity times.

Action: If this alert is triggered, review the status of event sources, rules, and targets to identify the cause of the low invocation count. Ensure that expected events are being generated and successfully routed to their targets. Check for misconfigurations or disruptions in upstream systems.
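A low-count alert tends to be noisy during hours when little traffic is expected. One possible approach, sketched here under assumptions, is to evaluate the "below 10 invocations" condition only outside a configured quiet window. The quiet hours (01:00 to 05:00), function names, and sample counts are all hypothetical and do not describe a Coralogix feature.

```python
from datetime import datetime, time

def in_quiet_hours(now, start=time(1, 0), end=time(5, 0)):
    """True when `now` falls inside [start, end), including windows that wrap midnight."""
    current = now.time()
    if start <= end:
        return start <= current < end
    return current >= start or current < end

def low_invocation_alert(count, now, threshold=10, quiet=(time(1, 0), time(5, 0))):
    """True when the 10-minute invocation count is below threshold outside quiet hours."""
    if in_quiet_hours(now, *quiet):
        return False
    return count < threshold

# Hypothetical 10-minute invocation counts at different times of day.
print(low_invocation_alert(3, datetime(2024, 1, 1, 14, 0)))   # afternoon, low count
print(low_invocation_alert(3, datetime(2024, 1, 1, 2, 30)))   # inside the quiet window
print(low_invocation_alert(25, datetime(2024, 1, 1, 14, 0)))  # afternoon, normal count
```

The same shape applies to the low triggered rule count alert by swapping in its threshold of 5.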

Integration

Learn more about Coralogix's out-of-the-box integration with Amazon EventBridge in our documentation.


Enterprise-Grade Solution