Quick Start Observability for Google Dataflow
Coralogix Extension For Google Dataflow Includes:
Dashboards - 1
Get instant visualization of all your Google Dataflow data.
Alerts - 2
Stay on top of key Google Dataflow performance metrics. Keep everyone informed through integrations with Slack, PagerDuty, and more.
Job Failure
This alert detects a job failure in a Google Dataflow job. It is activated when any job failure is detected, i.e., when the job failure ratio is greater than or equal to 1 for a continuous period of 10 minutes.
Customization Guidance:
Threshold: The default threshold is a job failure ratio of 1. Depending on your application's needs and past performance data, this threshold may need adjustment. Critical jobs with higher reliability requirements may need immediate alerts on any failure.
Cloud Provider Specificity: Tailor alerts for different cloud providers based on their roles and criticality in your infrastructure. Critical jobs running on specific cloud providers may warrant more stringent monitoring.
Notification Frequency: Tune how often this alert notifies to balance responsiveness against noise. Adjust according to the criticality of the jobs supported by Dataflow.
Action: When this alert triggers, immediately review recent job failure logs, check for configuration issues, and confirm that jobs are set up to run successfully. Mitigation may involve debugging job configurations or optimizing the infrastructure.
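As a rough illustration of the trigger condition described above (a value at or above the threshold for a continuous 10-minute period), here is a minimal Python sketch. It is not Coralogix's implementation: the function name `sustained_breach`, the sample format (timestamp and value pairs, oldest first), and the one-sample-per-minute cadence are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def sustained_breach(samples, threshold=1.0, window=timedelta(minutes=10)):
    """Return True if the most recent run of samples at or above `threshold`
    spans at least `window`.

    `samples` is a list of (timestamp, value) tuples sorted oldest first.
    This format is hypothetical, chosen only for illustration.
    """
    if not samples:
        return False
    latest = samples[-1][0]
    streak_start = None
    # Walk backwards from the newest sample until the condition stops holding.
    for ts, value in reversed(samples):
        if value < threshold:
            break
        streak_start = ts
    return streak_start is not None and latest - streak_start >= window

# Example data: one sample per minute over 10 minutes (assumed cadence).
start = datetime(2024, 1, 1, 12, 0)
always_failing = [(start + timedelta(minutes=i), 1.0) for i in range(11)]
# Same series, but the ratio drops to 0 at minute 5, which resets the streak.
interrupted = [(ts, 0.0 if i == 5 else v) for i, (ts, v) in enumerate(always_failing)]

print(sustained_breach(always_failing))
print(sustained_breach(interrupted))
```

Lowering `window` makes the check fire sooner at the cost of more noise, which is the trade-off the Notification Frequency guidance refers to.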
Low Disk Space Capacity
This alert detects low disk space capacity in a Google Dataflow job. It is activated when the remaining disk space capacity is less than 1 GB for a continuous period of 10 minutes.
Customization Guidance:
Threshold: The default threshold is 1 GB of remaining disk space. Depending on your application's needs and past performance data, this threshold may need adjustment. Critical jobs with high disk space requirements may need a higher threshold.
Job Specificity: Tailor alerts for different jobs based on their roles and criticality in your infrastructure. Critical jobs handling large data processing tasks may warrant more stringent monitoring.
Notification Frequency: Tune how often this alert notifies to balance responsiveness against noise. Adjust according to the criticality of the jobs supported by Dataflow.
Action: When this alert triggers, immediately review recent disk space usage logs, check for data processing issues, and confirm that jobs are configured to use disk space efficiently. Mitigation may involve optimizing job configurations or scaling disk resources.
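The Threshold and Job Specificity guidance above amounts to keeping the 1 GB default while giving critical jobs a higher threshold. The Python sketch below shows one way to model that. The job names, the override table, and the reading of 1 GB as 10^9 bytes are assumptions for illustration, and the sketch deliberately ignores the 10-minute duration condition.

```python
GB = 10 ** 9  # assumption: "1 GB" is read as 10^9 bytes

DEFAULT_THRESHOLD_BYTES = 1 * GB

# Hypothetical per-job overrides: a critical, disk-heavy job gets a higher threshold.
JOB_THRESHOLDS = {
    "nightly-etl": 5 * GB,
}

def threshold_for(job_name):
    """Return the remaining-disk threshold for a job, falling back to the default."""
    return JOB_THRESHOLDS.get(job_name, DEFAULT_THRESHOLD_BYTES)

def low_disk_jobs(remaining_by_job):
    """Return the sorted names of jobs whose remaining disk is below their threshold.

    `remaining_by_job` maps job name -> remaining disk space in bytes.
    """
    return sorted(
        job
        for job, remaining in remaining_by_job.items()
        if remaining < threshold_for(job)
    )

# Example readings (made-up values).
readings = {
    "nightly-etl": 3 * GB,   # below its 5 GB override
    "stream-a": 2 * GB,      # above the 1 GB default
    "stream-b": GB // 2,     # below the 1 GB default
}
print(low_disk_jobs(readings))
```

Keeping overrides in a single table makes it easy to review which jobs deviate from the default and why.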
Integration
Learn more about Coralogix's out-of-the-box integration with Google Dataflow in our documentation.