Metricbeat, an Elastic Beat based on the libbeat framework from Elastic, is a lightweight shipper that you can install on your servers to periodically collect metrics from the operating system and from services running on the server, everything from CPU and memory to Redis, NGINX, and more. Metricbeat takes the metrics and statistics that it collects and ships them to the output that you specify, such as Elasticsearch or Logstash.
In this post, we will cover some of the main use cases Metricbeat supports and we will examine various Metricbeat configuration use cases.
Metricbeat installation instructions can be found on the Elastic website.
To configure Metricbeat, edit the configuration file. The default configuration file is called metricbeat.yml. The location of the file varies by platform; for rpm and deb packages, you'll find it under /etc/metricbeat. There's also a full example configuration file at /etc/metricbeat/metricbeat.reference.yml that shows all non-deprecated options.
The Metricbeat configuration file uses YAML for its syntax, as it's easier to read and write than other common data formats like XML or JSON. The syntax includes dictionaries (unordered collections of name/value pairs) and also supports lists, numbers, strings, and many other data types. All members of the same list or dictionary must have the same indentation level. Lists and dictionaries can also be written in an abbreviated form somewhat similar to JSON, using {} for dictionaries and [] for lists. For more info, see the Elastic documentation on the config file format.
The Metricbeat configuration file consists mainly of the following sections: Modules, Processors, and Outputs. More information on how to configure Metricbeat can be found in the Elastic documentation. There are other sections you may include in your YAML, such as a Kibana endpoint, internal queue, etc. You may view them and their different options at the configuring Metricbeat link. Each of the sections has different options, and there are numerous module types, processors, different outputs to use, and more.
In this post, I will go over the main sections you may use and focus on giving examples that worked for us here at Coralogix.
The Modules section defines the Metricbeat input: the metrics that will be collected by Metricbeat. Each module contains one or more metricsets. There are various module types you may use with Metricbeat, and you can configure modules either in the modules.d directory (recommended) or in the Metricbeat configuration file. In my examples, I'll configure the modules in the Metricbeat configuration file. Here is an example of the System module; for more info, see the Elastic documentation on configuring modules and module types.
```yaml
#============================= Metricbeat Modules =============================
metricbeat.modules:
  - module: system
    metricsets:
      - cpu           # CPU usage
      - load          # CPU load averages
      - memory        # Memory usage
      - network       # Network IO
      #- core         # Per CPU core usage
      #- diskio       # Disk IO
      #- filesystem   # File system usage for each mountpoint
      #- fsstat       # File system summary metrics
      #- raid         # Raid
      #- socket       # Sockets and connection info (linux only)
      #- service      # systemd service information
    enabled: true
    period: 10s
```
There are more options for this module type, as you can see in the full example config file. The list above shows all the available metricsets for the system module. The enabled parameter is optional; if not specified, it defaults to true. The period parameter sets how often the metricsets are executed and is required for all module types. You may include multiple module types, all in the same YAML configuration. If you are working with Coralogix and wish to send your metrics to your Coralogix account, you will have to include the fields parameter with our required fields, in addition to choosing our Logstash endpoint in your output configuration; we will see a similar example later.
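If you go with the recommended modules.d layout instead, each module lives in its own file under /etc/metricbeat/modules.d/. As a minimal sketch, a hypothetical nginx.yml might look like this (the host value is a placeholder for your own NGINX status endpoint):

```yaml
# /etc/metricbeat/modules.d/nginx.yml
# Collects metrics from the NGINX stub status endpoint every 10 seconds.
- module: nginx
  metricsets:
    - stubstatus
  period: 10s
  # Placeholder endpoint; point this at your NGINX server's status page
  hosts: ["http://127.0.0.1"]
```

Modules configured this way can be toggled with the `metricbeat modules enable` and `metricbeat modules disable` commands rather than by editing metricbeat.yml.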
You can use Processors in order to process events before they are sent to the configured output. The libbeat library provides processors for reducing the number of exported fields, performing additional processing and decoding, and more. Each processor receives an event, applies a defined action to the event, and returns the event. If you define a list of processors, they are executed in the order they are defined in the configuration file. Below is an example of several configured processors; for more information, see the Elastic documentation on filtering and enhancing your data.
```yaml
# ================================= Processors =================================

# Processors are used to reduce the number of fields in the exported event or to
# enhance the event with external metadata. This section defines a list of
# processors that are applied one by one and the first one receives the initial
# event:
#
#   event -> filter1 -> event1 -> filter2 -> event2 ...
#
# The supported processors are drop_fields, drop_event, include_fields,
# decode_json_fields, and add_cloud_metadata.
#
# For example, you can use the following processors to keep the fields that
# contain CPU load percentages, but remove the fields that contain CPU ticks
# values:
#
#processors:
#  - include_fields:
#      fields: ["cpu"]
#  - drop_fields:
#      fields: ["cpu.user", "cpu.system"]
#
# The following example drops the events that have the HTTP response code 200:
#
#processors:
#  - drop_event:
#      when:
#        equals:
#          http.code: 200
#
# The following example renames the field a to b:
#
#processors:
#  - rename:
#      fields:
#        - from: "a"
#          to: "b"
#
# The following example enriches each event with the machine's local time zone
# offset from UTC.
#
#processors:
#  - add_locale:
#      format: offset
#
# The following example enriches each event with host metadata.
#
#processors:
#  - add_host_metadata: ~
#
# The following example decodes fields containing JSON strings
# and replaces the strings with valid JSON objects.
#
#processors:
#  - decode_json_fields:
#      fields: ["field1", "field2", ...]
#      process_array: false
#      max_depth: 1
#      target: ""
#      overwrite_keys: false
#
# The following example copies the value of message to message_copied
#
#processors:
#  - copy_fields:
#      fields:
#        - from: message
#          to: message_copied
#      fail_on_error: true
#      ignore_missing: false
#
# The following example preserves the raw message under event_original, which
# is then cut at 1024 bytes
#
#processors:
#  - copy_fields:
#      fields:
#        - from: message
#          to: event_original
#      fail_on_error: false
#      ignore_missing: true
#  - truncate_fields:
#      fields:
#        - event_original
#      max_bytes: 1024
#      fail_on_error: false
#      ignore_missing: true
#
# The following example URL-decodes the value of field1 to field2
#
#processors:
#  - urldecode:
#      fields:
#        - from: "field1"
#          to: "field2"
#      ignore_missing: false
#      fail_on_error: true
#
# The following example is a great method to enable sampling in Metricbeat,
# using the Script processor
#
#processors:
#  - script:
#      lang: javascript
#      id: my_filter
#      source: >
#        function process(event) {
#            if (Math.floor(Math.random() * 100) < 50) {
#                event.Cancel();
#            }
#        }
```
Metricbeat offers more types of processors, as you can see in the Elastic documentation, and you may also include conditions in your processor definitions. If you use Coralogix, you have an alternative to Metricbeat processors, to some extent, as you can set different kinds of parsing rules through the Coralogix UI instead. If you maintain your own ELK stack or another third-party logging tool, you should look into processors whenever you need parsing.
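As a brief sketch of a conditional processor, the fragment below drops the host metadata block only for events produced by the cpu metricset. The field names follow the standard Metricbeat event layout, but treat the exact fields you match on as something to verify against your own events:

```yaml
processors:
  # Remove the verbose host block, but only for system/cpu events
  - drop_fields:
      when:
        equals:
          metricset.name: cpu
      fields: ["host"]
      ignore_missing: true
```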
You configure Metricbeat to write to a specific output by setting options in the Outputs section of the metricbeat.yml config file. Only a single output may be defined. In this example, I am using the Logstash output; this is the required option if you wish to send your logs to your Coralogix account using Metricbeat. For more output options, see the Elastic documentation.
```yaml
# ================================= Logstash Output =================================
output.logstash:
  # Boolean flag to enable or disable the output module.
  enabled: true

  # The Logstash hosts
  hosts: ["localhost:5044"]

  # Configure escaping HTML symbols in strings.
  escape_html: true

  # Number of workers per Logstash host.
  worker: 1

  # Optionally load-balance events between Logstash hosts. Default is false.
  loadbalance: false

  # The maximum number of seconds to wait before attempting to connect to
  # Logstash after a network error. The default is 60s.
  backoff.max: 60s

  # Optional index name. The default index name is set to filebeat
  # in all lowercase.
  index: 'filebeat'

  # The number of times to retry publishing an event after a publishing failure.
  # After the specified number of retries, the events are typically dropped.
  # Some Beats, such as Filebeat and Winlogbeat, ignore the max_retries setting
  # and retry until all events are published. Set max_retries to a value less
  # than 0 to retry until all events are published. The default is 3.
  max_retries: 3

  # The maximum number of events to bulk in a single Logstash request. The
  # default is 2048.
  bulk_max_size: 2048

  # The number of seconds to wait for responses from the Logstash server before
  # timing out. The default is 30s.
  timeout: 30s
```
This example shows only some of the configuration options for the Logstash output; there are more. It's important to note that when using Coralogix, you specify the following Logstash host under hosts: logstashserver.coralogix.com:5044. Some other options are redundant, such as the index name, as it is defined on our side.
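Putting that together, a minimal Logstash output pointed at Coralogix might look like the sketch below; check the Coralogix documentation for the exact endpoint and port that apply to your account:

```yaml
output.logstash:
  enabled: true
  # Coralogix Logstash endpoint, as noted above
  hosts: ["logstashserver.coralogix.com:5044"]
  # No index setting on purpose: Coralogix defines the index on its side
```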
At this point, we have enough Metricbeat knowledge to start exploring some actual configuration files. They are commented and you can use them as references to get additional information about different plugins and parameters or to learn more about Metricbeat.
This example uses the system module to monitor your local server and send different metric sets. The Processors section includes a processor to drop unneeded beat metadata. The chosen output for this example is stdout.
```yaml
#============================= Metricbeat Modules =============================
metricbeat.modules:
  - module: system
    metricsets:
      - cpu      # CPU usage
      - load     # CPU load averages
      - memory   # Memory usage
      - network  # Network IO
    enabled: true
    period: 10s

#================================= Processors =================================
processors:
  - drop_fields:
      fields: ["agent", "ecs", "host"]
      ignore_missing: true

#=============================== Console Output ===============================
output.console:
  pretty: true
```
This example uses the system module to monitor your local server and send different metricsets, forwarding the events to Coralogix's Logstash server (output) over a secured connection. The Processors section includes a processor that samples the events, sending only 50% of the data.
```yaml
#============================= Metricbeat Modules =============================
metricbeat.modules:
  - module: system
    metricsets:
      - cpu      # CPU usage
      - load     # CPU load averages
      - memory   # Memory usage
      - network  # Network IO
    enabled: true
    period: 10s

#================================== General ===================================
fields_under_root: true
fields:
  PRIVATE_KEY: "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
  COMPANY_ID: XXXXX
  APP_NAME: "metricbeat"
  SUB_SYSTEM: "system"

#================================= Processors =================================
processors:
  # The following is a great method to enable sampling in Metricbeat, using the
  # Script processor. This script processor drops 50% of the incoming events.
  - script:
      lang: javascript
      id: my_filter
      source: >
        function process(event) {
            if (Math.floor(Math.random() * 100) < 50) {
                event.Cancel();
            }
        }

#=============================== Logstash Output ==============================
output.logstash:
  enabled: true
  hosts: ["logstashserver.coralogix.com:5015"]
  ssl.certificate_authorities: ["/var/log/metricbeat/ca.crt"]
```