Quick Start Observability for Amazon Kinesis Data Streams
Coralogix Extension For Amazon Kinesis Data Streams Includes:
Dashboards - 1
Gain instantaneous visualization of all your Amazon Kinesis Data Streams data.
Alerts - 6
Stay on top of key Amazon Kinesis Data Streams performance metrics. Keep everyone in the know through integrations with Slack, PagerDuty, and more.
Amazon Kinesis Data Stream - Consumption Delay
This alert is specifically designed to monitor and identify potential issues with stream consumption delay, such as high data volume, slow consumer processing, or exceeded throughput limits. This ensures timely processing of data in your Kinesis stream. This alert triggers when the stream consumption delay exceeds 5000 milliseconds (5 seconds).
Customization Guidance:
- Threshold: The default threshold is set to trigger when the consumption delay exceeds 5000 milliseconds over the last 10 minutes. Depending on your specific use case and expected processing times, this threshold can be adjusted to better align with your operational standards and service requirements.
- Monitoring Period: The monitoring period is set to 10 minutes but can be adjusted to shorter or longer intervals based on traffic patterns and the criticality of data processing. Shorter intervals can be used for high-traffic, mission-critical applications to detect issues more quickly.
- Notification Frequency: Adjust the frequency of this alert to balance responsiveness and alert fatigue. Tune it according to the criticality of continuous, uninterrupted data processing for your service.
Action: If this alert is triggered, investigate the causes of the stream consumption delay. Potential causes include high data volume, slow processing by consumer applications, exceeded provisioned throughput, or network issues. Ensure that your consumer applications are operating efficiently and that the Kinesis stream is adequately provisioned. Check the health and performance of the Kinesis service to rule out any ongoing issues or outages. Review any recent changes to your Kinesis stream configuration or consumer applications that could affect data processing.
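As a rough illustration of the threshold and monitoring period described above, the following Python sketch checks whether any delay sample in the last 10 minutes exceeds 5000 milliseconds. The `DelaySample` type and the `delay_alert_fires` function are hypothetical names introduced for this example, and taking the maximum over the window is an assumption: the page does not state how the alert aggregates samples. In CloudWatch, consumption delay is most likely reflected by the `GetRecords.IteratorAgeMilliseconds` metric, but that mapping is also an assumption here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

DELAY_THRESHOLD_MS = 5000        # default threshold from the alert description
WINDOW = timedelta(minutes=10)   # default monitoring period


@dataclass(frozen=True)
class DelaySample:
    """Hypothetical record of one consumption-delay measurement."""
    timestamp: datetime
    delay_ms: float


def delay_alert_fires(samples, now, threshold_ms=DELAY_THRESHOLD_MS, window=WINDOW):
    """Return True if any sample inside the window exceeds the threshold.

    Assumption: the window is evaluated on its maximum value; the real
    alert may aggregate differently.
    """
    cutoff = now - window
    recent = [s.delay_ms for s in samples if cutoff <= s.timestamp <= now]
    return bool(recent) and max(recent) > threshold_ms


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    samples = [
        DelaySample(now - timedelta(minutes=3), 6200.0),   # inside the window
        DelaySample(now - timedelta(minutes=20), 9000.0),  # too old, ignored
    ]
    print(delay_alert_fires(samples, now))
```

Shortening `window` makes the check react faster to a backlog, at the cost of more frequent notifications, which is the trade-off the Monitoring Period guidance describes.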
Amazon Kinesis Data Stream - Write Throughput Exceeded
This alert is specifically designed to monitor and identify potential issues with write throughput in your Kinesis stream. It ensures that your PutRecord and PutRecords operations remain within the provisioned capacity to avoid throttling and potential data loss. This alert triggers when the write throughput limits for the Kinesis stream are exceeded.
Customization Guidance:
- Threshold: The default threshold is set to trigger when the number of write operations exceeding the provisioned throughput limit is greater than 5 over the last 10 minutes. Depending on your specific use case and expected traffic, this threshold can be adjusted to better align with your operational standards and service requirements.
- Monitoring Period: The monitoring period is set to 10 minutes but can be adjusted to shorter or longer intervals based on traffic patterns and the criticality of data processing. Shorter intervals can be used for high-traffic, mission-critical applications to detect issues more quickly.
- Notification Frequency: Adjust the frequency of this alert to balance responsiveness and alert fatigue. Tune it according to the criticality of continuous, uninterrupted data processing for your service.
Action: If this alert is triggered, investigate the causes of the write throughput exceedance. Potential causes include a spike in data volume, increased write operations, or suboptimal shard configuration. Ensure that your Kinesis stream is appropriately provisioned and consider increasing the shard count if necessary. Check the health and performance of the Kinesis service to rule out any ongoing issues or outages. Review any recent changes to your Kinesis stream configuration or producing applications that could affect throughput.
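When deciding whether to increase the shard count, a back-of-the-envelope estimate can help. The sketch below assumes a provisioned-mode stream and the per-shard write limits published by AWS (about 1 MB per second and 1,000 records per second); the function name `shards_needed_for_writes` and the `headroom` parameter are hypothetical and introduced only for this example.

```python
import math

# Per-shard write limits for provisioned-mode streams, as published by AWS.
# 1 MB is treated here as 1,000,000 bytes, which is the conservative reading.
MAX_WRITE_BYTES_PER_SEC = 1_000_000
MAX_WRITE_RECORDS_PER_SEC = 1_000


def shards_needed_for_writes(peak_bytes_per_sec, peak_records_per_sec, headroom=0.25):
    """Estimate the shard count needed to absorb a peak write load.

    headroom is a hypothetical safety margin (0.25 adds 25 percent on top
    of the observed peak). The estimate assumes records are spread evenly
    across shards, which depends on the partition keys in use.
    """
    if headroom < 0:
        raise ValueError("headroom must be non-negative")
    by_bytes = peak_bytes_per_sec * (1 + headroom) / MAX_WRITE_BYTES_PER_SEC
    by_records = peak_records_per_sec * (1 + headroom) / MAX_WRITE_RECORDS_PER_SEC
    # Whichever limit is hit first determines the shard count.
    return max(1, math.ceil(max(by_bytes, by_records)))


if __name__ == "__main__":
    # Peak of 2 MB/s and 800 records/s with the default 25 percent headroom.
    print(shards_needed_for_writes(2_000_000, 800))
```

If the estimate is well above the current shard count yet throttling persists after resharding, uneven partition keys (a "hot" shard) are worth checking, since the calculation assumes an even spread.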
Amazon Kinesis Data Stream - Read Throughput Exceeded
This alert is specifically designed to monitor and identify potential issues with read throughput in your Kinesis stream. It ensures that your GetRecords operations remain within the provisioned capacity to avoid throttling and potential data loss. This alert triggers when the read throughput limits for the Kinesis stream are exceeded.
Customization Guidance:
- Threshold: The default threshold is set to trigger when the number of read operations exceeding the provisioned throughput limit is greater than 5 over the last 10 minutes. Depending on your specific use case and expected traffic, this threshold can be adjusted to better align with your operational standards and service requirements.
- Monitoring Period: The monitoring period is set to 10 minutes but can be adjusted to shorter or longer intervals based on traffic patterns and the criticality of data processing. Shorter intervals can be used for high-traffic, mission-critical applications to detect issues more quickly.
- Notification Frequency: Adjust the frequency of this alert to balance responsiveness and alert fatigue. Tune it according to the criticality of continuous, uninterrupted data processing for your service.
Action: If this alert is triggered, investigate the causes of the read throughput exceedance. Potential causes include a spike in data consumption, increased read operations, or suboptimal shard configuration. Ensure that your Kinesis stream is appropriately provisioned and consider increasing the shard count if necessary. Check the health and performance of the Kinesis service to rule out any ongoing issues or outages. Review any recent changes to your Kinesis stream configuration or consuming applications that could affect throughput.
Amazon Kinesis Data Stream - GetRecords Average Latency
This alert is specifically designed to monitor and identify potential issues with the average latency of the GetRecords operations in your Kinesis stream. It ensures that your data retrieval processes are performing efficiently to maintain optimal application performance. This alert triggers when the average latency of the GetRecords operations exceeds 500 milliseconds.
Customization Guidance:
- Threshold: The default threshold is set to trigger when the average latency of GetRecords operations exceeds 500 milliseconds over the last 10 minutes. Depending on your specific use case and expected performance, this threshold can be adjusted to better align with your operational standards and service requirements.
- Monitoring Period: The monitoring period is set to 10 minutes but can be adjusted to shorter or longer intervals based on traffic patterns and the criticality of data processing. Shorter intervals can be used for high-traffic, mission-critical applications to detect issues more quickly.
- Notification Frequency: Adjust the frequency of this alert to balance responsiveness and alert fatigue. Tune it according to the criticality of continuous, uninterrupted data processing for your service.
Action: If this alert is triggered, investigate the causes of the increased latency. Potential causes include high data volume, slow processing by consumer applications, or network issues. Ensure that your Kinesis stream and consuming applications are optimized for performance. Check the health and performance of the Kinesis service to rule out any ongoing issues or outages. Review any recent changes to your Kinesis stream configuration or consuming applications that could affect latency.
Amazon Kinesis Data Stream - PutRecords Average Latency
This alert is specifically designed to monitor and identify potential issues with the average latency of the PutRecords operations in your Kinesis stream. It ensures that your data ingestion processes are performing efficiently to maintain optimal application performance. This alert triggers when the average latency of the PutRecords operations exceeds 500 milliseconds.
Customization Guidance:
- Threshold: The default threshold is set to trigger when the average latency of PutRecords operations exceeds 500 milliseconds over the last 10 minutes. Depending on your specific use case and expected performance, this threshold can be adjusted to better align with your operational standards and service requirements.
- Monitoring Period: The monitoring period is set to 10 minutes but can be adjusted to shorter or longer intervals based on traffic patterns and the criticality of data processing. Shorter intervals can be used for high-traffic, mission-critical applications to detect issues more quickly.
- Notification Frequency: Adjust the frequency of this alert to balance responsiveness and alert fatigue. Tune it according to the criticality of continuous, uninterrupted data processing for your service.
Action: If this alert is triggered, investigate the causes of the increased latency. Potential causes include high data volume, slow processing by producing applications, or network issues. Ensure that your Kinesis stream and producing applications are optimized for performance. Check the health and performance of the Kinesis service to rule out any ongoing issues or outages. Review any recent changes to your Kinesis stream configuration or producing applications that could affect latency.
Amazon Kinesis Data Stream - PutRecords Error Rate
This alert is specifically designed to monitor and identify potential issues with the error rates of the PutRecords operations in your Kinesis stream. It ensures that your data ingestion processes are reliable and that errors are promptly detected and addressed to maintain optimal application performance. This alert triggers when the error rate of the PutRecords operations exceeds 1% over the last 10 minutes.
Customization Guidance:
- Threshold: The default threshold is set to trigger when the error rate of PutRecords operations exceeds 1% over the last 10 minutes. Depending on your specific use case and expected error tolerance, this threshold can be adjusted to better align with your operational standards and service requirements.
- Monitoring Period: The monitoring period is set to 10 minutes but can be adjusted to shorter or longer intervals based on traffic patterns and the criticality of data processing. Shorter intervals can be used for high-traffic, mission-critical applications to detect issues more quickly.
- Notification Frequency: Adjust the frequency of this alert to balance responsiveness and alert fatigue. Tune it according to the criticality of continuous, uninterrupted data processing for your service.
Action: If this alert is triggered, investigate the causes of the increased error rate. Potential causes include high data volume, network issues, or issues with the producing applications. Ensure that your Kinesis stream and producing applications are optimized for performance and reliability. Check the health and performance of the Kinesis service to rule out any ongoing issues or outages. Review any recent changes to your Kinesis stream configuration or producing applications that could affect the error rate.
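To make the 1% threshold concrete, here is a minimal sketch of an error-rate check over one monitoring window. How the alert itself computes the rate is not stated on this page, so the formula (failed records divided by total records) is an assumption, and both function names are hypothetical. The two counts could come, for example, from the CloudWatch metrics `PutRecords.FailedRecords` and `PutRecords.TotalRecords` summed over the window.

```python
def put_records_error_rate(failed_records, total_records):
    """Return the error rate as a percentage of total records.

    Assumption: the rate is failed / total for one monitoring window.
    A window with no traffic is reported as 0.0 rather than an error.
    """
    if total_records <= 0:
        return 0.0
    return 100.0 * failed_records / total_records


def error_rate_alert_fires(failed_records, total_records, threshold_pct=1.0):
    """Return True when the error rate strictly exceeds the threshold."""
    return put_records_error_rate(failed_records, total_records) > threshold_pct


if __name__ == "__main__":
    # 15 failures out of 1,000 records in the window is 1.5 percent.
    print(put_records_error_rate(15, 1000))
    print(error_rate_alert_fires(15, 1000))
```

Treating an empty window as a 0% error rate keeps the check quiet during idle periods; a stream that is expected to carry traffic continuously may warrant a separate "no data" check instead.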
Integration
Learn more about Coralogix's out-of-the-box integration with Amazon Kinesis Data Streams in our documentation.