Quick Start Observability for Amazon DocumentDB
Coralogix Extension For Amazon DocumentDB Includes:
Dashboards - 1
Visualize all your Amazon DocumentDB data instantly.
Alerts - 8
Stay on top of key Amazon DocumentDB performance metrics. Keep everyone informed through integrations with Slack, PagerDuty, and more.
High CPU Utilization
This alert monitors and detects sustained high CPU usage on DocumentDB clusters, ensuring workloads operate within optimal performance levels. It helps identify resource bottlenecks and potential performance degradation caused by excessive load or inefficient queries.
An alert triggers if the CPU utilization exceeds 80% for more than 10 minutes. This condition may indicate high computational demand, inefficient queries, or the need for instance scaling.
Customization Guidance:
- Threshold: The default threshold is set to trigger an alert if CPU utilization exceeds 80% for 10 minutes. Adjust the threshold based on workload intensity and acceptable performance levels for your instances.
- Monitoring Period: Adjust the monitoring period to shorter or longer durations depending on the sensitivity and workload patterns. Use shorter durations for critical workloads to catch issues rapidly.
- Notification Frequency: Configure notifications to balance responsiveness and alert noise. Higher notification frequency is recommended for critical workloads.
Action: If this alert triggers, analyze query execution and workload distribution using performance insights or logs. Optimize slow queries and reduce computational demand. Consider scaling the instance size or adding replicas if necessary to handle the load.
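The "exceeds a threshold for more than 10 minutes" condition can be illustrated with a minimal sketch. It assumes one CPU sample per minute and treats the condition as "every sample in the last 10 minutes is above the threshold"; the function name `sustained_breach` and the sample values are hypothetical, and this is not Coralogix's actual evaluation logic.

```python
from typing import Sequence


def sustained_breach(samples: Sequence[float], threshold: float, window: int) -> bool:
    """Return True if the most recent `window` samples all exceed `threshold`.

    Assumption: one sample per minute, so `window` is the number of minutes.
    """
    if window <= 0 or len(samples) < window:
        # Not enough data to say the condition held for the whole period.
        return False
    return all(value > threshold for value in samples[-window:])


# Hypothetical per-minute CPU utilization readings, in percent.
cpu_percent = [72.0, 85.1, 88.4, 91.0, 86.3, 84.9, 90.2, 87.7, 83.5, 89.9, 92.4]

print(sustained_breach(cpu_percent, threshold=80.0, window=10))
```

Lowering `window` makes the check more sensitive, which mirrors the guidance on using shorter monitoring periods for critical workloads.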
Low Free Memory
This alert monitors available memory to ensure the DocumentDB instance maintains sufficient resources for smooth operation. It helps detect memory shortages that may lead to performance issues or application failures.
An alert triggers if the free memory drops below 1 GB for more than 10 minutes. This condition may indicate memory-intensive workloads, poor query optimization, or insufficient instance sizing.
Customization Guidance:
- Threshold: The default threshold is set to alert if free memory drops below 1 GB for 10 minutes. Adjust this threshold based on your instance configuration and workload requirements.
- Monitoring Period: Adjust the monitoring period to shorter or longer durations depending on the sensitivity and workload patterns. Use shorter durations for critical workloads to catch issues rapidly.
- Notification Frequency: Configure notifications to balance responsiveness and alert noise. Higher notification frequency is recommended for critical workloads.
Action: If this alert triggers, examine query workloads and memory-intensive operations. Optimize queries or workloads consuming excessive memory. Consider scaling the instance size or distributing the workload across additional instances.
High Replica Lag
This alert monitors the lag between the primary and replica instances in DocumentDB clusters, ensuring replicas remain closely synchronized. It helps identify replication delays that may impact read performance or data consistency.
An alert triggers if the maximum replica lag exceeds 1 second for more than 10 minutes. This condition may indicate high write load, replication issues, or underprovisioned resources.
Customization Guidance:
- Threshold: The default threshold is set to alert if replica lag exceeds 1 second for 10 minutes. Adjust the threshold based on acceptable latency levels for your application.
- Monitoring Period: Adjust the monitoring period to shorter or longer durations depending on the sensitivity and workload patterns. Use shorter durations for critical workloads to catch issues rapidly.
- Notification Frequency: Configure notifications to balance responsiveness and alert noise. Higher notification frequency is recommended for critical workloads.
Action: If this alert triggers, check the write workload and evaluate if the primary instance is experiencing bottlenecks. Investigate replication logs and metrics to identify root causes. Consider optimizing write queries or scaling resources to reduce replication lag.
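Because the condition is stated in terms of the maximum replica lag, a rough sketch of how per-replica readings might be reduced to a single worst-case series may help. It assumes lag values are recorded in milliseconds at one sample per minute for each replica; the replica names, the function names, and the data are hypothetical.

```python
from typing import Dict, List


def max_lag_per_minute(lag_by_replica: Dict[str, List[float]]) -> List[float]:
    """Collapse per-replica lag series into the worst lag seen each minute."""
    return [max(values) for values in zip(*lag_by_replica.values())]


def replica_lag_alert(
    lag_by_replica: Dict[str, List[float]],
    threshold_ms: float = 1000.0,  # 1 second, expressed in milliseconds (assumed unit)
    window: int = 10,              # minutes, assuming one sample per minute
) -> bool:
    """Return True if the worst lag stayed above the threshold for the whole window."""
    worst = max_lag_per_minute(lag_by_replica)
    if len(worst) < window:
        return False
    return all(value > threshold_ms for value in worst[-window:])


# Hypothetical readings: one healthy replica and one that is falling behind.
readings = {
    "replica-a": [200.0] * 10,
    "replica-b": [1500.0] * 10,
}

print(replica_lag_alert(readings))
```

A single lagging replica is enough to trip the check, which is the point of watching the maximum rather than the average.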
High Disk Queue Depth
This alert monitors the number of outstanding read/write requests on the disk, ensuring the disk can handle the workload without creating significant bottlenecks. It helps identify potential I/O contention or underperforming storage.
An alert triggers if the disk queue depth exceeds 64 for more than 10 minutes. This condition may indicate high I/O demand or suboptimal query performance.
Customization Guidance:
- Threshold: The default threshold is set to alert if disk queue depth exceeds 64 for 10 minutes. Adjust the threshold based on I/O workload and acceptable performance levels.
- Monitoring Period: Adjust the monitoring period to shorter or longer durations depending on the sensitivity and workload patterns. Use shorter durations for critical workloads to catch issues rapidly.
- Notification Frequency: Configure notifications to balance responsiveness and alert noise. Higher notification frequency is recommended for critical workloads.
Action: If this alert triggers, analyze I/O metrics and query patterns to identify bottlenecks. Optimize queries or workloads with high I/O demands. Consider scaling storage or redistributing workloads to improve performance.
Low Buffer Cache Hit Ratio
This alert monitors the buffer cache hit ratio to ensure DocumentDB efficiently serves read requests from the buffer cache, reducing reliance on slower disk I/O operations. It helps detect inefficient memory usage or query patterns.
An alert triggers if the buffer cache hit ratio drops below 90% for more than 10 minutes. This condition may indicate memory pressure, inefficient query patterns, or high disk I/O demand.
Customization Guidance:
- Threshold: The default threshold is set to alert if the buffer cache hit ratio drops below 90% for 10 minutes. Adjust this threshold based on workload characteristics and memory availability.
- Monitoring Period: Adjust the monitoring period based on workload dynamics and application sensitivity to performance issues. Shorter durations are better for memory-intensive workloads.
- Notification Frequency: Configure notifications to balance timely response with noise. High notification frequencies are recommended for memory-sensitive applications.
Action: If this alert triggers, evaluate memory usage and query patterns to identify inefficiencies. Optimize queries to reduce the need for full scans and ensure proper indexing. Consider scaling the instance size to increase available memory and improve caching.
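To make the 90% figure concrete, here is a small sketch of what a cache hit ratio expresses: the share of requests served from memory rather than disk. The hit and miss counts are hypothetical inputs used only for illustration; in practice the ratio normally arrives as a ready-made percentage metric and does not need to be computed by hand.

```python
from typing import Optional


def hit_ratio_percent(hits: int, misses: int) -> Optional[float]:
    """Return the percentage of requests served from the cache.

    Returns None when there were no requests, since the ratio is undefined.
    """
    total = hits + misses
    if total == 0:
        return None
    return 100.0 * hits / total


def below_default_threshold(ratio: Optional[float], threshold: float = 90.0) -> bool:
    """Compare a ratio against the 90% default; an undefined ratio never alerts."""
    return ratio is not None and ratio < threshold


# Hypothetical counts for two one-minute intervals.
healthy = hit_ratio_percent(hits=9200, misses=800)
pressured = hit_ratio_percent(hits=850, misses=150)

print(healthy, below_default_threshold(healthy))
print(pressured, below_default_threshold(pressured))
```

Every miss corresponds to a read that had to go to disk, which is why a falling ratio tends to accompany rising disk I/O.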
Low Index Buffer Cache Hit Ratio
This alert monitors the index buffer cache hit ratio to ensure that DocumentDB efficiently uses the buffer cache for indexed queries. It helps detect performance issues caused by inefficient memory utilization or high reliance on disk I/O for indexed data access.
An alert triggers if the index buffer cache hit ratio drops below 90% for more than 10 minutes. This condition may indicate memory pressure, poorly designed indexes, or increased query demand.
Customization Guidance:
- Threshold: The default threshold is set to alert if the index buffer cache hit ratio drops below 90% for 10 minutes. Adjust this threshold based on workload characteristics and memory requirements for indexed queries.
- Monitoring Period: Adjust the monitoring period to shorter or longer durations based on workload dynamics and application sensitivity to query performance. Shorter periods are recommended for critical workloads relying heavily on indexed queries.
- Notification Frequency: Configure notifications to balance timely response with noise. High notification frequencies are recommended for memory-sensitive applications.
Action: If this alert triggers, review query patterns and ensure proper indexing to optimize index cache utilization. Analyze memory usage and consider scaling up the instance size to increase available memory. Reassess your indexing strategy to reduce the impact of cache misses and optimize query execution.
High Read Latency
This alert monitors read latency to ensure DocumentDB read operations remain efficient and responsive. It helps detect performance issues caused by high read demand, suboptimal query execution, or resource bottlenecks.
An alert triggers if the average read latency exceeds 20 ms for more than 10 minutes. This condition may indicate high read demand, inefficient indexes, or underprovisioned resources.
Customization Guidance:
- Threshold: The default threshold is set to alert if read latency exceeds 20 ms for 10 minutes. Adjust this threshold based on acceptable performance levels for your application.
- Monitoring Period: Adjust the monitoring period based on workload patterns and application requirements. Shorter durations are suitable for latency-sensitive applications.
- Notification Frequency: Configure notifications based on the criticality of low-latency read operations. Higher frequencies are recommended for latency-sensitive environments.
Action: If this alert triggers, analyze query patterns and indexes for inefficiencies. Optimize slow queries and ensure indexes are used effectively. Scale storage or add replicas to distribute the read load.
High Write Latency
This alert monitors write latency to ensure DocumentDB write operations remain efficient and responsive. It helps detect performance issues caused by high write demand, suboptimal query execution, or resource bottlenecks.
An alert triggers if the average write latency exceeds 10 ms for more than 10 minutes. This condition may indicate high write demand, slow disk performance, or insufficient resource allocation.
Customization Guidance:
- Threshold: The default threshold is set to alert if write latency exceeds 10 ms for 10 minutes. Adjust this threshold based on acceptable performance levels for your application.
- Monitoring Period: Modify the monitoring period to reflect write workload patterns. Shorter periods are better for applications requiring low-latency writes.
- Notification Frequency: Adjust notification frequency based on how critical low-latency write performance is to your workload.
Action: If this alert triggers, evaluate write patterns and optimize queries to reduce write load. Consider increasing storage throughput or scaling instances to handle the demand. Monitor disk performance and address potential bottlenecks.
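The default conditions of the eight alerts described above can be collected into one table for review before customizing thresholds. The following is an illustrative sketch only: the `AlertDefault` class and its field names are hypothetical and do not reflect the extension's actual configuration format. The thresholds and the 10-minute period are the defaults stated in the alert descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertDefault:
    name: str
    comparison: str   # "above" or "below" the threshold
    threshold: float
    unit: str
    minutes: int = 10  # every alert uses a 10-minute period by default


DEFAULTS = [
    AlertDefault("High CPU Utilization", "above", 80, "%"),
    AlertDefault("Low Free Memory", "below", 1, "GB"),
    AlertDefault("High Replica Lag", "above", 1, "s"),
    AlertDefault("High Disk Queue Depth", "above", 64, "requests"),
    AlertDefault("Low Buffer Cache Hit Ratio", "below", 90, "%"),
    AlertDefault("Low Index Buffer Cache Hit Ratio", "below", 90, "%"),
    AlertDefault("High Read Latency", "above", 20, "ms"),
    AlertDefault("High Write Latency", "above", 10, "ms"),
]


def violates(alert: AlertDefault, value: float) -> bool:
    """Check one reading, given in the alert's unit, against its default threshold."""
    if alert.comparison == "above":
        return value > alert.threshold
    return value < alert.threshold


for alert in DEFAULTS:
    print(f"{alert.name}: {alert.comparison} {alert.threshold:g} {alert.unit} "
          f"for {alert.minutes} min")
```

Keeping the direction of each comparison explicit avoids a common mistake when adjusting thresholds: the "Low" alerts fire when a value falls below the limit, while the "High" alerts fire when it rises above.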
Integration
Learn more about Coralogix's out-of-the-box integration with Amazon DocumentDB in our documentation.