Quick Start Observability for Azure Cosmos DB Observability

Azure Cosmos DB Observability

Coralogix Extension For Azure Cosmos DB Observability Includes:

Dashboards - 1

Gain instantaneous visualization of all your Azure Cosmos DB Observability data.

Azure Cosmos DB

Alerts - 4

Stay on top of key Azure Cosmos DB Observability performance metrics. Keep everyone in the know through integrations with Slack, PagerDuty, and more.

Azure Cosmos DB - Availability per account <100%

This alert monitors the minimum service availability of Azure Cosmos DB database accounts. It is triggered when the minimum service availability stays below a predefined threshold for a continuous period.

Customisation Guidance:

Threshold: The alert name indicates a default threshold of 100%. Depending on your application needs and service level agreements (SLAs), this threshold may require adjustment. Because the alert fires when availability falls below the threshold, a threshold closer to 100% detects availability issues earlier and helps ensure compliance with SLAs, while a lower threshold reduces noise.

Monitoring Period: A monitoring period of 5 minutes helps filter out transient fluctuations and focus on sustained availability issues. Adjust this period based on the criticality of the services and the operational dynamics of your system.

Action: When this alert triggers, investigate the root cause of the availability degradation, such as network issues, database configuration changes, or resource constraints. Consider implementing automated failover mechanisms or scaling resources to maintain service availability during peak usage periods or infrastructure failures.

Azure Cosmos DB - Account Deletion

This alert detects an increase in the number of Azure Cosmos DB database account deletions. It is triggered when the number of account deletions exceeds a predefined threshold for a continuous period.

Customisation Guidance:

Threshold: The default threshold is set at a certain value. Depending on your application needs and past performance data, this threshold may require adjustment. Lower thresholds can help detect issues earlier, especially for critical applications.

Monitoring Period: A monitoring period of 10 minutes helps filter out transient spikes and focus on sustained issues. Adjust this period based on the criticality of the operations and the dynamics of your system.

Action: When this alert triggers, review recent Azure Cosmos DB metrics and logs to identify the cause of the deletions. Consider implementing safeguards or access controls to prevent unauthorised account deletions, and back up critical data regularly to mitigate potential data loss.

Azure Cosmos DB - High total 4xx

This alert detects an increase in the number of requests resulting in client errors (4xx status codes) for Azure Cosmos DB accounts. It is triggered when the number of such requests exceeds the threshold for a continuous period of 5 minutes.

Customisation Guidance:

Threshold: The default threshold is set at 0, so any sustained 4xx traffic triggers the alert. Depending on your application needs and past performance data, this threshold may need adjustment. Applications with high transaction rates may need a higher threshold so that routine client errors do not trigger the alert. Adjust the threshold in the PromQL query accordingly.

Monitoring Period: A monitoring period of 5 minutes helps filter out transient spikes and focus on sustained issues with the requests. Depending on the criticality of the operations and the dynamics of your system, you might need to adjust this period. Ensure that any changes are reflected in the PromQL query.

Specificity: The alert can be tailored for different Azure Cosmos DB accounts based on their roles within your infrastructure. Accounts handling critical operations may require more stringent monitoring than others.

Notification Frequency: To balance responsiveness and noise, consider the frequency of this alert. Adjustments may be needed based on the criticality of the applications supported by the Azure Cosmos DB accounts.

Action: When this alert triggers, immediate actions should include:
- Reviewing recent request patterns and logs for the Azure Cosmos DB account.
- Checking for recent configuration changes or deployments that might have affected request handling.
- Investigating any potential misconfigurations or performance bottlenecks within the Azure Cosmos DB setup.
- Considering automated mitigation strategies such as adjusting request limits, optimising queries, or scaling the database resources to match service capacity requirements.

By closely monitoring and adjusting the parameters of this alert, you can ensure it accurately reflects the health and performance of your Azure Cosmos DB accounts, thereby maintaining the reliability of your applications.
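
To make the "exceeds the threshold for a continuous period" rule concrete, here is a minimal Python sketch of that logic. It is an illustration only, not the alert's actual PromQL query or Coralogix's implementation: the function name, the per-minute sample list, and the numbers are all hypothetical.

```python
from typing import Sequence


def breached_continuously(samples: Sequence[float], threshold: float, window: int) -> bool:
    """Return True if each of the last `window` samples exceeds `threshold`.

    `samples` is assumed to hold one 4xx request count per minute, oldest
    first (a hypothetical input shape chosen for this sketch).
    """
    if window <= 0 or len(samples) < window:
        # Not enough data to cover the monitoring period yet.
        return False
    return all(value > threshold for value in samples[-window:])


# Hypothetical per-minute 4xx counts for one account.
per_minute_4xx = [0, 3, 0, 2, 4, 1, 5, 2]

# Threshold 0 and a 5-minute period, mirroring the defaults described above.
print(breached_continuously(per_minute_4xx, threshold=0, window=5))
```

A single quiet minute inside the window resets the condition, which is how a monitoring period filters out short spikes while still catching sustained error traffic.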

Azure Cosmos DB - High total 5xx

This alert detects an increase in the number of requests resulting in server errors (HTTP status codes 5xx) to Azure Cosmos DB database accounts. It is triggered when the number of server error requests exceeds a predefined threshold for a continuous period.

Customisation Guidance:

Threshold: The default threshold is set at a certain value. Depending on your application needs and past performance data, this threshold may require adjustment. Lower thresholds can help detect issues earlier, especially for critical applications.

Monitoring Period: A monitoring period of 5 minutes helps filter out transient spikes and focus on sustained issues. Adjust this period based on the criticality of the operations and the dynamics of your system.

Error Code Specificity: Tailor alerts for different HTTP status codes based on their impact. For example, a 503 Service Unavailable might need a different response than a 500 Internal Server Error.

Notification Frequency: Balance the frequency of notifications to optimise responsiveness and reduce noise. Adjust according to the criticality of the applications supported.

Action: When this alert triggers, review recent Azure Cosmos DB metrics and logs, investigate any recent changes to the database configuration, and check for potential performance bottlenecks or misconfigurations. Consider automated mitigation strategies such as scaling resources or implementing retry mechanisms to handle transient errors effectively.
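
As one way to picture the retry mechanism mentioned above, the following Python sketch retries an operation with exponential backoff and jitter. It is a generic illustration under stated assumptions, not an Azure SDK or Coralogix API: `TransientServerError`, `retry_with_backoff`, and the default delays are hypothetical names and values, and your client library may already provide its own retry policy.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientServerError(Exception):
    """Hypothetical stand-in for a retryable 5xx failure (for example, 503)."""


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on TransientServerError with growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientServerError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            # Double the delay on each retry and add jitter so that many
            # clients do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("unreachable")


# Usage: a hypothetical operation that fails twice, then succeeds.
calls = {"count": 0}


def flaky_read() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientServerError("503 Service Unavailable")
    return "ok"


print(retry_with_backoff(flaky_read, sleep=lambda seconds: None))
```

Retrying only a dedicated transient error type keeps persistent failures, such as a misconfiguration that returns 500 on every request, visible to the alert instead of masking them behind retries.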

Integration

Learn more about Coralogix's out-of-the-box integration with Azure Cosmos DB Observability in our documentation.
