

Tail Sampling with Coralogix and OpenTelemetry

Last Updated: Jan. 18, 2023

Are you looking for a way to improve trace observability without breaking the bank? Look no further!

In our previous tutorial, we showed you how to set up the new OpenTelemetry Community Demo Application and send telemetry data to Coralogix, giving you the ability to understand the interactions between your services and to visualize, query, and alert on them from your Coralogix dashboard.

But what about cost? Ingesting all of your traces and spans can quickly add up, and it is often unnecessary for gaining visibility into the health of your applications. That’s where trace sampling comes in – in particular, tail sampling.

By sampling your traces, you can significantly reduce the amount of data ingested into Coralogix, maintaining full visibility into your services without incurring heavy charges. Try it out with the OTel Demo App in this tutorial.

Intro

The following tutorial will demonstrate how to use the OTel Collector in a load balanced configuration with tail sampling enabled on the collector nodes, using the OTel Demo App.

The tail sampling processor and probabilistic sampling processor allow you to sample traces based on a set of rules at the collector level.

This allows you to define more advanced rules, such as retaining full visibility into error or high-latency traces.
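
For example, a minimal, illustrative tail_sampling configuration (distinct from the one built later in this tutorial) could keep every trace that contains an error or runs longer than 500 ms; the policy names and the latency threshold here are arbitrary:

processors:
  tail_sampling:
    # wait 10s after the first span of a trace before making a sampling decision
    decision_wait: 10s
    policies:
      [
        {
          # keep every trace that contains a span with an error status
          name: errors,
          type: status_code,
          status_code: {status_codes: [ERROR]}
        },
        {
          # keep every trace whose overall duration exceeds 500 ms
          name: slow-traces,
          type: latency,
          latency: {threshold_ms: 500}
        },
      ]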

Note: To achieve this in your own environment, your code should be instrumented with OpenTelemetry and emit its telemetry data to the OTel Collector.
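
As a rough sketch (the service name and image below are hypothetical), an instrumented service can be pointed at the Collector using the standard OTLP environment variables in its docker-compose definition:

services:
  my-service:                        # hypothetical instrumented service
    image: my-org/my-service:latest  # hypothetical image
    environment:
      - OTEL_SERVICE_NAME=my-service
      # ship OTLP telemetry to the Collector over gRPC on port 4317
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otelcol:4317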

What is Tail Sampling, and Why is it Important?

Tail sampling is a method of trace sampling in which sampling decisions are made at the end of the workflow, allowing for a more accurate sampling decision. This is in contrast to head-based sampling, in which the sampling decision is made at the beginning of a request, usually at random. Tail sampling gives you the option of filtering your traces based on specific criteria, a plus when compared with head-based sampling.
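
For contrast, head-based sampling is typically configured in the SDK at trace start. Here is a rough sketch using the standard OpenTelemetry sampler environment variables (the service definition is hypothetical and not part of the demo):

services:
  my-service:                        # hypothetical instrumented service
    environment:
      # head-based sampling: the decision is made when the trace starts, at random,
      # before anything is known about errors or latency
      - OTEL_TRACES_SAMPLER=parentbased_traceidratio
      - OTEL_TRACES_SAMPLER_ARG=0.25   # keep ~25% of traces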

So why is tail sampling important, and why should you do it?

  • Enjoy focused observability. Tail sampling is a powerful tool for focused observability, allowing you to zero in on the traces that matter most to you.
  • Identify issues. Tail sampling is useful for identifying issues in your distributed system while saving on observability costs.
  • Save on costs. By selectively exporting a predetermined subset of your traces, you can lower data ingestion and storage costs, while still being able to identify and troubleshoot issues.

Tutorial: Implement Tail Sampling in the OTel Collector

Follow the steps below to use the OTel Collector in a load balanced configuration with tail sampling enabled on the collector nodes, using the OTel Demo.

Prerequisites

  • Git and Docker with Docker Compose installed on your local machine
  • A Coralogix account and its private key

Step-by-Step Guide

STEP 1 – Clone the opentelemetry-demo repository to your local machine

git clone https://github.com/open-telemetry/opentelemetry-demo.git

STEP 2 – Edit the .env file

  • Change into the opentelemetry-demo directory and edit the .env file. Add the following lines to the bottom and update the values to reflect your Coralogix account, as in the example below:
    • CORALOGIX_DOMAIN: You’ll need to include your account’s specific domain in the Coralogix endpoint.
    • CORALOGIX_PRIVATE_KEY: Insert your Coralogix private key. Bear in mind that this information is sensitive and should be kept secure.
    • CORALOGIX_APP_NAME & CORALOGIX_SUBSYS_NAME: Customize and organize your data in your Coralogix dashboard using application and subsystem names.
# ********************
# Exporter Configuration
# ********************
CORALOGIX_DOMAIN=coralogix.com
CORALOGIX_APP_NAME=otel
CORALOGIX_SUBSYS_NAME=otel-demo
CORALOGIX_PRIVATE_KEY=b3887c90-5e67-4249-e81b-EXAMPLEKE

STEP 3 – Configure the load balancing exporter

  • Configure a load balancing exporter:
    • In the src/otelcollector/otelcol-config-extras.yml file, paste the following:
exporters:
  loadbalancing:
    protocol:
      otlp:
        # TLS configuration between the exporter LB and the exporters can be encrypted - for this environment we are instructing the exporter LB to ignore TLS errors
        tls:
          insecure: true
        # all options from the OTLP exporter are supported
        # except the endpoint
        timeout: 1s
    resolver:
      static:
        # These would be the corresponding docker hosts running the exporters
        hostnames:
        - otel-col-shipper:4317
        - otel-col-shipper-2:4317
    
processors:
  batch/traces:
    timeout: 1s
    send_batch_size: 50
  resourcedetection:
    detectors: [system, env]
    timeout: 5s
    override: true

service:
  pipelines:
    traces:
      processors: [batch/traces, batch]
      exporters: [logging, otlp, loadbalancing]

Wondering what we just did?

  • The above configuration gets merged into the main OTel configuration file and does the following:
    • Defines the otlp protocol to be used to communicate with the onward collectors (we disable TLS for this connection)
    • Defines 2 target collectors to be used: otel-col-shipper & otel-col-shipper-2. Because the load balancing exporter routes spans by trace ID, all spans belonging to the same trace land on the same target collector, which is what makes tail sampling on those collectors possible.
    • Configures some batch settings for shipping the traces
    • Defines the traces pipeline to forward the traces onward

STEP 4 – Create a config file to be used by both of the load balanced OTel collectors

  • Create a new file called otel-standalone.yml in the root of the opentelemetry-demo folder and paste the following configuration:
extensions:
  health_check:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch/traces:
    timeout: 1s
    send_batch_size: 50
  resourcedetection:
    detectors: [system, env]
    timeout: 5s
    override: true
  tail_sampling:
    decision_wait: 10s 
    num_traces: 100
    expected_new_traces_per_sec: 10
    policies:
      [          
        {
          name: errors-policy,
          type: status_code,
          status_code: {status_codes: [ERROR]}
        },
        {
          name: randomized-policy,
          type: probabilistic,
          probabilistic: {sampling_percentage: 25}
        },
      ]
  attributes/shipper:
    actions:
      - key: shipper
        action: insert
        value: '${SHIPPER_NAME}'

exporters:
  logging:
  coralogix:
    # The Coralogix traces ingress endpoint
    traces:
      endpoint: "otel-traces.${CORALOGIX_DOMAIN}:443"
    metrics:
      endpoint: "otel-metrics.${CORALOGIX_DOMAIN}:443"
    logs:
      endpoint: "otel-logs.${CORALOGIX_DOMAIN}:443"

    # Your Coralogix private key is sensitive
    private_key: "${CORALOGIX_PRIVATE_KEY}"

    application_name: "${CORALOGIX_APP_NAME}"
    subsystem_name: "${CORALOGIX_SUBSYS_NAME}"

    # (Optional) Timeout is the timeout for every attempt to send data to the backend.
    timeout: 30s

service:
  extensions: [health_check]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes/shipper, tail_sampling, batch/traces, resourcedetection]
      exporters: [coralogix, logging]
    logs:
      receivers: [otlp]
      exporters: [coralogix, logging]

Wondering what we just did?

  • This config file does the following:
    • Sets up an OTLP gRPC listener (the load balancer collector will ship the traces to this port on each collector)
    • Configures some batch settings for shipping the traces
    • Configures a tail_sampling processor with the following 2 policies:
      • Ship 25% of traces that do not contain errors
      • Ship all traces that contain errors
    • Configures an attributes processor to inject an environment variable as a new field into each span. This allows us to identify which load balanced shipper sent a span to Coralogix.
    • Configures the Coralogix exporter, using environment variables from the .env file to inject the required values
    • Defines the outgoing exporter pipelines

STEP 5 – Update the OTel Collector container definition

  • Open the docker-compose.yml file and update the otelcol container definition (line 603) to match the environment variables that we defined earlier. Add the environment section to the file, as in the example below.
# OpenTelemetry Collector - Load Balancer
  otelcol:
    image: otel/opentelemetry-collector-contrib:0.68.0
    container_name: otel-col
    deploy:
      resources:
        limits:
          memory: 100M
    restart: unless-stopped
    command: [ "--config=/etc/otelcol-config.yml", "--config=/etc/otelcol-config-extras.yml" ]
    volumes:
      - ./src/otelcollector/otelcol-config.yml:/etc/otelcol-config.yml
      - ./src/otelcollector/otelcol-config-extras.yml:/etc/otelcol-config-extras.yml
    ports:
      - "4317"          # OTLP over gRPC receiver
      - "4318:4318"     # OTLP over HTTP receiver
      - "9464"          # Prometheus exporter
      - "8888"          # metrics endpoint
    depends_on:
      - jaeger
      - otelcol_shipper
      - otelcol_shipper_2
    logging: *logging

  # OpenTelemetry Collector - Shipper 1
  otelcol_shipper:
    image: otel/opentelemetry-collector-contrib:0.68.0
    container_name: otel-col-shipper
    deploy:
      resources:
        limits:
          memory: 100M
    restart: unless-stopped
    command: [ "--config=/etc/otelcol-config.yml" ]
    volumes:
      - ./otel-standalone.yml:/etc/otelcol-config.yml
    ports:
      - "4317"          # OTLP over gRPC receiver
      - "4318"          # OTLP over HTTP receiver
      - "9464"          # Prometheus exporter
      - "8888"          # metrics endpoint
    environment:
      - CORALOGIX_DOMAIN
      - CORALOGIX_APP_NAME
      - CORALOGIX_SUBSYS_NAME
      - CORALOGIX_PRIVATE_KEY
      - SHIPPER_NAME=shipper-1
    logging: *logging

# OpenTelemetry Collector - Shipper 2
  otelcol_shipper_2:
    image: otel/opentelemetry-collector-contrib:0.68.0
    container_name: otel-col-shipper-2
    deploy:
      resources:
        limits:
          memory: 100M
    restart: unless-stopped
    command: [ "--config=/etc/otelcol-config.yml" ]
    volumes:
      - ./otel-standalone.yml:/etc/otelcol-config.yml
    ports:
      - "4317"          # OTLP over gRPC receiver
      - "4318"          # OTLP over HTTP receiver
      - "9464"          # Prometheus exporter
      - "8888"          # metrics endpoint
    environment:
      - CORALOGIX_DOMAIN
      - CORALOGIX_APP_NAME
      - CORALOGIX_SUBSYS_NAME
      - CORALOGIX_PRIVATE_KEY
      - SHIPPER_NAME=shipper-2
    logging: *logging

Wondering what we just did?

  • This file does the following:
    • Updates the OTel Collector to use the latest tagged release
    • Configures the load balancer collector to depend on (depends_on) our 2 new OTel Collector instances
    • Defines 2 new OTel Collector containers, with each container using the new standalone OTel configuration (otel-standalone.yml), allowing them both to ship to Coralogix
    • Passes the variables defined in .env in STEP 2 into the containers as environment variables to be used in the OTel configuration file
    • Defines a SHIPPER_NAME environment variable for each shipper, which will be injected into each span to identify which collector shipped the span to Coralogix

STEP 6 – Build the Environment

  • To run the environment, use the following command:
docker compose up --no-build
  • If you are using an M1/M2 Mac, you’ll need to build the containers to ensure that they’re using the right architecture. This will take significantly longer (upwards of 20 minutes).
docker compose up --build
  • Here is what your Docker Desktop dashboard should look like:

STEP 7 – Validation & Testing

  • The following 3 commands will print the logs from each of the OTel collectors:
docker logs -f otel-col
docker logs -f otel-col-shipper
docker logs -f otel-col-shipper-2
  • Once you have brought the environment online, let it run for a few minutes and then check each of the containers. You should see log output from each collector; the last few lines indicate that the collector is shipping spans.

  • For further validation, generate trace errors by navigating to the URL http://localhost:8080/feature/featureflags and enabling the productCatalogFailure feature flag. This will instruct the environment to create errors, allowing our OTel collectors to ship them!

STEP 8 – Finally… view your traces on your Coralogix dashboard!

  • View the traces generated by the project on your Coralogix dashboard. Click Explore > Tracing.
  • Select ADD FILTER in the top left corner of your screen and select shipper. This will allow you to filter the traces based on which load balanced collector sent them.
  • Select the relevant shipper or shippers.
  • Select one of the traces and view the shipper name in the traces info table.

OTel and Coralogix

The OpenTelemetry Community Demo application is an awesome tool for getting to know OpenTelemetry and instrumentation best practices. In this tutorial, we showed you how to use the OTel Demo to run the OTel Collector in a load-balanced configuration with tail sampling enabled on the collector nodes. Use this as inspiration for implementing tail sampling for your application traces.

Interested in learning more?

  • Review the OTel documentation on the Coralogix site
  • Check out our awesome, instructional OTel videos:
    • Here’s a video on how to integrate traces into Coralogix using OpenTelemetry, Kubernetes & Helm
    • Here’s another on capturing Kubernetes logs, transforming them with Logs2Metrics and rendering with DataMap
  • Contact us! Our world-class customer success team is available 24/7 to walk you through your setup and answer any questions that may come up. Feel free to reach out to us via our in-app chat or by sending us an email at [email protected].
