Quick Start Observability for Amazon Kinesis Data Firehose

Amazon Kinesis Data Firehose

Coralogix Extension For Amazon Kinesis Data Firehose Includes:

Dashboards - 1

Gain instantaneous visualization of all your Amazon Kinesis Data Firehose data.

Amazon Kinesis Data Firehose Service

Alerts - 19

Stay on top of key Amazon Kinesis Data Firehose performance metrics. Keep everyone informed through integrations with Slack, PagerDuty, and more.

Partition Count Limit Exceeded

This alert fires when the partition count surpasses the predefined limit. It tracks the PartitionCountExceeded metric, which reports 1 when the partition count limit is breached and 0 when the count stays within the limit. The alert triggers when the metric value reaches 1.

Customization Guidance:
- Threshold: The default threshold is 1, which detects a breach of the partition count limit.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that limit breaches are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate and reduce the partition count so that it stays within the limit. This may involve optimizing partition usage, increasing partition limits, or redistributing data to balance the load.
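The trigger condition for a 0/1 metric such as PartitionCountExceeded can be illustrated with a minimal sketch. The function name and the list-of-datapoints input are hypothetical; the source does not describe how Coralogix evaluates the window internally, so treating "any datapoint reaches the threshold" as the rule is an assumption.

```python
from typing import Sequence


def partition_limit_breached(datapoints: Sequence[int], threshold: int = 1) -> bool:
    """Return True if any datapoint in the monitoring window reaches the threshold.

    `datapoints` is a hypothetical list of PartitionCountExceeded values
    (0 or 1) collected during one monitoring period.
    """
    return any(value >= threshold for value in datapoints)


# One breach inside the window is enough to trigger under this assumption.
window = [0, 0, 1, 0]
breached = partition_limit_breached(window)
```

Because the metric is binary, lowering the threshold below 1 would make the check fire on every datapoint, which is why the default of 1 is the only meaningful value in this sketch.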

JQ Processing Duration Exceeded

This alert monitors the execution time of JQ expressions within the JQ Lambda function. It tracks the JQProcessing.Duration metric, which measures how long it takes to execute a JQ expression. Prompt action is required if the execution time exceeds acceptable limits, to keep the Lambda function performing efficiently. The alert triggers when the average JQ processing duration exceeds 1000 milliseconds (1 second).

Customization Guidance:
- Threshold: The default threshold triggers when the average JQ processing duration exceeds 1000 milliseconds.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that prolonged execution times are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the prolonged execution time. This may involve optimizing the JQ expressions, reviewing the Lambda function's configuration, or scaling resources to handle the load.
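An average-based threshold like the one above can be sketched as follows. The helper name and the sample values are illustrative assumptions, not part of the Coralogix extension; the sketch only shows what "average over the monitoring period exceeds 1000 ms" means arithmetically.

```python
from statistics import mean
from typing import Sequence


def average_exceeds(samples_ms: Sequence[float], threshold_ms: float = 1000.0) -> bool:
    """Return True if the mean of the samples is strictly above the threshold.

    `samples_ms` is a hypothetical list of duration datapoints (in
    milliseconds) collected during one monitoring period.
    """
    if not samples_ms:
        # No data in the window: nothing to average, so do not trigger.
        return False
    return mean(samples_ms) > threshold_ms


# A single slow datapoint does not trigger if the window average stays low.
mostly_fast = [200.0, 250.0, 2400.0, 300.0]   # mean is 787.5 ms
sustained_slow = [900.0, 1200.0, 1100.0]      # mean is about 1066.7 ms
```

Averaging over the window is what makes the alert tolerant of isolated spikes; a shorter monitoring period reacts faster but gives each spike more weight.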

KMS Key Access Denial Detected

This alert tracks the number of times the service encounters a KMSAccessDeniedException for the delivery stream. It monitors the KMSKeyAccessDenied metric, which counts access denials by the KMS key. Immediate action is required to resolve access issues and keep the delivery stream running without interruption. The alert triggers when the number of access denials exceeds 2.

Customization Guidance:
- Threshold: The default threshold triggers when the number of access denials exceeds 2. Set the threshold value to detect when the number of access denials exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that access issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the access denial. This may involve checking the permissions of the KMS key, ensuring that the delivery stream has the necessary access rights, and updating any policies or roles associated with the KMS key.
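Count-based alerts such as this one compare a number of occurrences against a threshold over the monitoring period. The sketch below assumes the count is the sum of datapoints inside a 10-minute window; the source does not state the aggregation, and the function name and `(timestamp, count)` tuple shape are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Tuple


def count_exceeds(
    datapoints: Iterable[Tuple[datetime, int]],
    now: datetime,
    window: timedelta = timedelta(minutes=10),
    threshold: int = 2,
) -> bool:
    """Sum the counts whose timestamps fall inside the window ending at `now`
    and report whether the total is strictly above the threshold."""
    cutoff = now - window
    total = sum(count for ts, count in datapoints if cutoff < ts <= now)
    return total > threshold


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
denials = [
    (now - timedelta(minutes=15), 5),  # outside the 10-minute window, ignored
    (now - timedelta(minutes=5), 2),
    (now - timedelta(minutes=1), 1),
]
triggered = count_exceeds(denials, now)  # 2 + 1 = 3, which exceeds 2
```

The same shape applies to the other count-based alerts on this page (the remaining KMS exception alerts and the throttling alerts), with only the metric name changing.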

KMS Key Disabled Exception Detected

This alert tracks the number of times the service encounters a KMSDisabledException for the delivery stream. It monitors the KMSKeyDisabled metric, which counts access denials caused by a disabled KMS key. Immediate action is required to resolve this issue and keep the delivery stream operating continuously. The alert triggers when the number of KMS key disabled exceptions exceeds 2.

Customization Guidance:
- Threshold: The default threshold triggers when the number of KMS key disabled exceptions exceeds 2. Set the threshold value to detect when the number of KMS key disabled exceptions exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that disabled key issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate why the KMS key is disabled. This may involve verifying the status of the KMS key, re-enabling the key if necessary, and reviewing any policies or configurations that might have led to the key being disabled.

KMS Key Invalid State Exception Detected

This alert tracks the number of times the service encounters a KMSInvalidStateException for the delivery stream. It monitors the KMSKeyInvalidState metric, which counts invalid state errors for the KMS key. Immediate action is required to address this issue and keep the delivery stream functioning properly. The alert triggers when the number of invalid state errors exceeds 2.

Customization Guidance:
- Threshold: The default threshold triggers when the number of invalid state errors exceeds 2. Set the threshold value to detect when the number of invalid state exceptions exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that invalid state issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate why the KMS key is in an invalid state. This may involve checking the KMS key's status, correcting any misconfigurations, and ensuring that the key is in the appropriate state for use by the delivery stream.

KMS Key Not Found Exception Detected

This alert tracks the number of times the service encounters a KMSNotFoundException for the delivery stream. It monitors the KMSKeyNotFound metric, which counts occurrences of the KMS key not being found. Immediate action is required to resolve this issue and keep the delivery stream operating properly. The alert triggers when the number of not found exceptions exceeds 2.

Customization Guidance:
- Threshold: The default threshold triggers when the number of not found exceptions exceeds 2. Set the threshold value to detect when the number of not found exceptions exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that missing key issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate why the KMS key cannot be found. This may involve verifying the existence and correct configuration of the KMS key, ensuring that the key is properly associated with the delivery stream, and checking for any issues related to key deletion or misconfiguration.

DescribeDeliveryStream Operation Latency

This alert monitors the time taken for each DescribeDeliveryStream operation, measured over the specified time period. It tracks the DescribeDeliveryStream.Latency metric, which measures the latency of these operations. Immediate action is required if the latency exceeds acceptable limits, to keep the delivery stream performing well. The alert triggers when the average latency of DescribeDeliveryStream operations exceeds 500 milliseconds.

Customization Guidance:
- Threshold: The default threshold triggers when the average latency of DescribeDeliveryStream operations exceeds 500 milliseconds. Set the threshold value to detect when the latency of DescribeDeliveryStream operations exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that latency issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the increased latency. This may involve reviewing the performance of the DescribeDeliveryStream operations, optimizing the configuration, and ensuring that system resources are adequate to handle the load.

ListDeliveryStreams Operation Latency

This alert monitors the time taken for each ListDeliveryStreams operation, measured over the specified time period. It tracks the ListDeliveryStreams.Latency metric, which measures the latency of these operations. Immediate action is required if the latency exceeds acceptable limits, to keep the delivery stream performing well. The alert triggers when the average latency of ListDeliveryStreams operations exceeds 1000 milliseconds (1 second).

Customization Guidance:
- Threshold: The default threshold triggers when the average latency of ListDeliveryStreams operations exceeds 1000 milliseconds (1 second). Set the threshold value to detect when the latency of ListDeliveryStreams operations exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that latency issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the increased latency. This may involve reviewing the performance of the ListDeliveryStreams operations, optimizing the configuration, and ensuring that system resources are adequate to handle the load.

PutRecord Operation Latency

This alert monitors the time taken for each PutRecord operation, measured over the specified time period. It tracks the PutRecord.Latency metric, which measures the latency of these operations. Immediate action is required if the latency exceeds acceptable limits, to keep the delivery stream performing well. The alert triggers when the average latency of PutRecord operations exceeds 1000 milliseconds (1 second).

Customization Guidance:
- Threshold: The default threshold triggers when the average latency of PutRecord operations exceeds 1000 milliseconds (1 second). Set the threshold value to detect when the latency of PutRecord operations exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that latency issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the increased latency. This may involve reviewing the performance of the PutRecord operations, optimizing the configuration, and ensuring that system resources are adequate to handle the load.

PutRecordBatch Operation Latency

This alert monitors the time taken for each PutRecordBatch operation, measured over the specified time period. It tracks the PutRecordBatch.Latency metric, which measures the latency of these operations. Immediate action is required if the latency exceeds acceptable limits, to keep the delivery stream performing well. The alert triggers when the average latency of PutRecordBatch operations exceeds 1000 milliseconds (1 second).

Customization Guidance:
- Threshold: The default threshold triggers when the average latency of PutRecordBatch operations exceeds 1000 milliseconds (1 second). Set the threshold value to detect when the latency of PutRecordBatch operations exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that latency issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the increased latency. This may involve reviewing the performance of the PutRecordBatch operations, optimizing the configuration, and ensuring that system resources are adequate to handle the load.

UpdateDeliveryStream Operation Latency

This alert monitors the time taken for each UpdateDeliveryStream operation, measured over the specified time period. It tracks the UpdateDeliveryStream.Latency metric, which measures the latency of these operations. Immediate action is required if the latency exceeds acceptable limits, to keep the delivery stream performing well. The alert triggers when the average latency of UpdateDeliveryStream operations exceeds 1000 milliseconds (1 second).

Customization Guidance:
- Threshold: The default threshold triggers when the average latency of UpdateDeliveryStream operations exceeds 1000 milliseconds (1 second). Set the threshold value to detect when the latency of UpdateDeliveryStream operations exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that latency issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the increased latency. This may involve reviewing the performance of the UpdateDeliveryStream operations, optimizing the configuration, and ensuring that system resources are adequate to handle the load.

GetRecords Operation Throttling Detected

This alert tracks the total number of times the GetRecords operation is throttled when the data source is a Kinesis data stream. It monitors the ThrottledGetRecords metric, which counts throttling occurrences. Immediate action is required to resolve this issue and keep the delivery stream operating smoothly and efficiently. The alert triggers when the throttled count for the GetRecords operation exceeds 2.

Customization Guidance:
- Threshold: The default threshold triggers when the throttled count for the GetRecords operation exceeds 2. Set the threshold value to detect when the number of throttled GetRecords operations exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that throttling issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the throttling. This may involve increasing the throughput limits, optimizing the configuration of the Kinesis data stream, or redistributing the data load to prevent throttling.

DescribeStream Operation Throttling Detected

This alert tracks the total number of times the DescribeStream operation is throttled when the data source is a Kinesis data stream. It monitors the ThrottledDescribeStream metric, which counts throttling occurrences. Immediate action is required to resolve this issue and keep the delivery stream operating smoothly and efficiently. The alert triggers when the throttled count for the DescribeStream operation exceeds 2.

Customization Guidance:
- Threshold: The default threshold triggers when the throttled count for the DescribeStream operation exceeds 2. Set the threshold value to detect when the number of throttled DescribeStream operations exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that throttling issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the throttling. This may involve increasing the throughput limits, optimizing the configuration of the Kinesis data stream, or redistributing the data load to prevent throttling.

GetShardIterator Operation Throttling Detected

This alert tracks the total number of times the GetShardIterator operation is throttled when the data source is a Kinesis data stream. It monitors the ThrottledGetShardIterator metric, which counts throttling occurrences. Immediate action is required to resolve this issue and keep the delivery stream operating smoothly and efficiently. The alert triggers when the throttled count for the GetShardIterator operation exceeds 2.

Customization Guidance:
- Threshold: The default threshold triggers when the throttled count for the GetShardIterator operation exceeds 2. Set the threshold value to detect when the number of throttled GetShardIterator operations exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that throttling issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the throttling. This may involve increasing the throughput limits, optimizing the configuration of the Kinesis data stream, or redistributing the data load to prevent throttling.

Amazon Lambda Function Execution Delay Alert

This alert monitors the time taken by each Lambda function invocation performed by Firehose. It tracks the ExecuteProcessing.Duration metric, which measures the duration of these invocations. Immediate action is required if the execution time exceeds acceptable limits, to keep the delivery stream performing well. The alert triggers when the average duration of Lambda function executions exceeds 1000 milliseconds (1 second).

Customization Guidance:
- Threshold: The default threshold triggers when the average duration of Lambda function executions exceeds 1000 milliseconds (1 second). Set the threshold value to detect when the execution duration of the Lambda function exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that execution time issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the prolonged execution time. This may involve optimizing the Lambda function code, reviewing the function's configuration, and ensuring that system resources are adequate to handle the load.

Kinesis Data Stream Lag Detected

This alert monitors the lag time, in milliseconds, for the Kinesis data stream. It tracks the KinesisMillisBehindLatest metric, which indicates how many milliseconds the last read record is behind the newest record in the Kinesis data stream. Immediate action is required if the lag exceeds acceptable limits, to keep data processing timely and the delivery stream performing well. The alert triggers when the average lag time exceeds 1000 milliseconds (1 second).

Customization Guidance:
- Threshold: The default threshold triggers when the average lag time exceeds 1000 milliseconds (1 second). Set the threshold value to detect when the lag time exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that lag issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the increased lag. This may involve reviewing the performance of the data stream, optimizing the configuration, and ensuring that system resources are adequate to handle the load.

Data Freshness Alert for S3 Backup

This alert monitors the age of the oldest record in Amazon Data Firehose that is pending backup. It tracks the BackupToS3.DataFreshness metric, which measures the time elapsed from when a record enters Amazon Data Firehose to the current time. Any record older than this age has been delivered to the Amazon S3 bucket for backup. Amazon Data Firehose emits this metric when data transformation is enabled for Amazon S3 or Amazon Redshift destinations. Immediate action is required if the data freshness exceeds acceptable limits, to keep data delivery and backup timely. The alert triggers when the average age of the oldest record exceeds 300 seconds (5 minutes).

Customization Guidance:
- Threshold: The default threshold triggers when the average age of the oldest record exceeds 300 seconds (5 minutes). Set the threshold value to detect when the age of the oldest record exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that data freshness issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the delay in data backup. This may involve reviewing the performance of the data transformation process, optimizing the configuration, and ensuring that system resources are adequate to handle the load.

Data Freshness Alert for S3 Delivery

This alert monitors the age of the oldest record in Amazon Data Firehose that is pending delivery. It tracks the DeliveryToS3.DataFreshness metric, which measures the time elapsed from when a record enters Amazon Data Firehose to the current time. Any record older than this age has been delivered to the S3 bucket. Immediate action is required if the data freshness exceeds acceptable limits, to keep data delivery timely. The alert triggers when the average age of the oldest record exceeds 300 seconds (5 minutes).

Customization Guidance:
- Threshold: The default threshold triggers when the average age of the oldest record exceeds 300 seconds (5 minutes). Set the threshold value to detect when the age of the oldest record exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that data freshness issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the delay in data delivery. This may involve reviewing the performance of the delivery process, optimizing the configuration, and ensuring that system resources are adequate to handle the load.

Data Freshness Alert for HTTP Endpoint Delivery

This alert monitors the age of the oldest record in Amazon Data Firehose that is pending delivery to an HTTP endpoint. It tracks the DeliveryToHttpEndpoint.DataFreshness metric, which measures the time elapsed from when a record enters Amazon Data Firehose to the current time. Immediate action is required if the data freshness exceeds acceptable limits, to keep data delivery to the HTTP endpoint timely. The alert triggers when the average age of the oldest record exceeds 90 seconds.

Customization Guidance:
- Threshold: The default threshold triggers when the average age of the oldest record exceeds 90 seconds. Set the threshold value to detect when the age of the oldest record exceeds acceptable limits.
- Monitoring Period: The default monitoring period is 10 minutes. Shorten or lengthen it based on traffic patterns and the criticality of data processing, so that data freshness issues are detected promptly without excessive alerting.
- Notification Frequency: Adjust how often this alert notifies to balance responsiveness against alert fatigue, according to how critical continuous, uninterrupted data processing is for your service.

Action: If this alert is triggered, investigate the cause of the delay in data delivery. This may involve reviewing the performance of the delivery process, optimizing the configuration, and ensuring that system resources are adequate to handle the load.
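The data freshness metrics above are described as the time elapsed between a record entering Amazon Data Firehose and the current time, taken for the oldest pending record. A minimal sketch of that arithmetic follows. The function names, the list of arrival timestamps, and the threshold table are illustrative assumptions (the table simply restates the default thresholds from this page), and the check is applied to a single age value rather than the window average the alerts use.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable

# Default thresholds in seconds, restated from the alert descriptions above.
DEFAULT_THRESHOLDS_SECONDS = {
    "BackupToS3.DataFreshness": 300,
    "DeliveryToS3.DataFreshness": 300,
    "DeliveryToHttpEndpoint.DataFreshness": 90,
}


def data_freshness_seconds(pending_arrivals: Iterable[datetime], now: datetime) -> float:
    """Age in seconds of the oldest record that is still pending.

    `pending_arrivals` is a hypothetical collection of the times at which
    not-yet-delivered records entered the stream. Returns 0.0 if nothing
    is pending.
    """
    oldest = min(pending_arrivals, default=None)
    if oldest is None:
        return 0.0
    return (now - oldest).total_seconds()


def is_stale(metric: str, age_seconds: float) -> bool:
    """Return True if the age is strictly above the metric's default threshold."""
    return age_seconds > DEFAULT_THRESHOLDS_SECONDS[metric]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
pending = [now - timedelta(seconds=120), now - timedelta(seconds=30)]
age = data_freshness_seconds(pending, now)  # oldest pending record is 120 s old
```

With the same 120-second age, the HTTP endpoint check exceeds its 90-second default while the two S3 checks stay under their 300-second defaults, which shows why the HTTP endpoint alert is the most sensitive of the three.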

Integration

Learn more about Coralogix's out-of-the-box integration with Amazon Kinesis Data Firehose in our documentation.

