Troubleshooting
Limits & quotas
- Coralogix enforces a hard limit of 10 MB on data sent to our OpenTelemetry endpoints; we recommend staying at or below 2 MB.
- Metric names must not exceed 255 characters.
- Attribute keys for metric data must not exceed 255 characters.
Metrics
You can control how much telemetry the Collector emits about itself using the level field under service.telemetry.metrics. The possible values are:
- "none" indicates that no telemetry data should be collected
- "basic" is recommended and covers the basics of service telemetry
- "normal" adds additional indicators on top of the basic level
- "detailed" adds dimensions and views to the previous levels
For example:
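A minimal sketch of raising the level, assuming the setting lives under service.telemetry.metrics in the Collector configuration (the value detailed is chosen here only for illustration):

```yaml
service:
  telemetry:
    metrics:
      # One of: none, basic, normal, detailed (see the list above).
      level: detailed
```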
This adds more metrics around exporter latency and various processor metrics.
Prometheus Receiver
If you are missing metrics collected by the Prometheus receiver, make sure to check Collector logs.
When the Prometheus receiver fails to collect application metrics, it typically logs a "Failed to scrape Prometheus endpoint" error that includes the target information.
For example:
message_obj:{
level:warn
ts:2024-12-13T08:19:17.809Z
caller:internal/transaction.go:129
msg:Failed to scrape Prometheus endpoint
kind:receiver
name:prometheus
data_type:metrics
scrape_timestamp:1734077957789
target_labels:{__name__="up", container="main", endpoint="4001", namespace="namespace", pod="pod-name"}
}
The generic error doesn't tell you much. To get more details, you will need to enable debug logs inside the Collector:
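A minimal sketch, assuming the Collector's own log verbosity is configured through service.telemetry.logs.level:

```yaml
service:
  telemetry:
    logs:
      # Debug output is verbose; consider reverting to "info" after troubleshooting.
      level: debug
```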
Then you will start seeing the actual metrics and errors in Collector logs, which will help you troubleshoot the issue further.
Common errors
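"invalid sample: non-unique label names": the metric contains the same label name more than once. For example, a sample such as my_metric{label1="value1", label1="value2"} 1 (hypothetical names) repeats label1.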
This is not allowed in Prometheus / OpenMetrics, but some libraries produce such labels. The best fix is in the application or library itself. As a workaround, you can use metric_relabel_configs, which is applied before the metric is ingested.
For example, you can drop the label1 label:
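A sketch of such a rule, assuming the duplicated label is literally named label1 and that the rule is placed inside the affected job under the Prometheus receiver's scrape_configs. It uses the native Prometheus key names; verify the result against your own scrape output:

```yaml
metric_relabel_configs:
  # Remove every label whose name matches the regex (here: label1).
  - action: labeldrop
    regex: label1
```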
Alternatively, you can replace the label with itself, leaving only a single instance of it:
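A sketch of this variant, again assuming the duplicated label is named label1 and using the native Prometheus key names (source_labels, target_label). Which of the duplicated values survives is not something this sketch guarantees, so check the resulting series:

```yaml
metric_relabel_configs:
  # Write the value of label1 back onto label1 so that only one label1 remains.
  - action: replace
    source_labels: [label1]
    target_label: label1
```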
"'le' label on histogram metric is missing or empty": the histogram metric contains multiple types. Typically, the metrics library exposes a metric that is both a histogram and a summary, which is not allowed in Prometheus / OpenMetrics. For example:
# HELP http_server_requests_seconds
# TYPE http_server_requests_seconds histogram
http_server_requests_seconds_bucket{le="0.025",} 1
http_server_requests_seconds_count{} 15.0
http_server_requests_seconds_sum{} 0.20938292
...
http_server_requests_seconds{quantile="0.999",} 0.0
The best fix is to change the application or library so that it produces only a histogram. As a workaround, you can use metric_relabel_configs. The following example drops the series that carry a quantile label:
metric_relabel_configs:
  - source_labels: [__name__, quantile]
    regex: http_server_requests_seconds;.*
    action: drop
Traces
The OpenTelemetry Collector can export its own traces using an OTLP exporter. You can send these traces to the OTLP receiver running on the same Collector so that they pass through the configured pipelines. For example:
service:
  telemetry:
    traces:
      processors:
        - batch:
            exporter:
              otlp:
                protocol: grpc/protobuf
                endpoint: ${env:MY_POD_IP}:4317