
A practical guide to Logstash

Logstash is a tool to collect, process, and forward events and log messages, and this Logstash tutorial will get you started with it quickly. It was created by Jordan Sissel, who, with a background in operations and system administration, found himself constantly managing huge volumes of log data that really needed a centralized system to aggregate and manage them. Logstash was born out of that need, and in 2013 Sissel teamed up with Elasticsearch.

Collection is accomplished via configurable input plugins, including raw socket/packet communication, file tailing, and several message bus clients. Once an input plugin has collected data, it can be processed by any number of filter plugins that modify and annotate the event data. Finally, Logstash routes events to output plugins that can forward the data to a variety of destinations, including Elasticsearch, local files, and several message bus implementations.

 

Logstash Configuration File

The Logstash configuration file specifies which plugins are to be used and how. You can reference event fields in a configuration and use conditionals to process events when they meet certain criteria. When running Logstash, you use -f to specify your config file.
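
For example, bin/logstash -f pipeline.conf starts Logstash with a configuration file named pipeline.conf (the file name here is just an illustration). As a minimal sketch of a field reference inside a conditional, assuming events that carry a 'status' field, an output could be routed like this:

output {
  if [status] == "error" {
    # only events whose 'status' field equals "error" reach this output
    stdout { codec => rubydebug }
  }
}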

The configuration file has a section for each type of plugin you want to add to the event processing pipeline:

input {
  ...
}
filter {
  ...
}
output {
  ...
}

Multiple filters are applied in the order of their appearance in the configuration file, and within each section we list the configuration options for the specific plugin.
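
For instance, in the hypothetical filter section below (the field names are placeholders, not taken from the examples later in this guide), the json filter always runs before the mutate filter because it appears first:

filter {
  json {
    source => "message"                  # expand the JSON string in 'message' first
  }
  mutate {
    rename => { "host" => "hostname" }   # then rename one of the resulting fields
  }
}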

Settings vary according to the individual plugin, and a plugin can require that the value of a setting be of a certain type, such as a string, number, boolean, array, or hash.

Inputs and outputs support codec plugins that enable you to encode or decode the data as it enters or exits the pipeline without having to use a separate filter.
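
As a rough illustration of both points (the port, tags, and ssl_enable values below are placeholders, not taken from any later example), a single input can combine settings of several types with a codec:

input {
  tcp {
    port => 5000                  # number
    tags => ["example", "demo"]   # array
    ssl_enable => false           # boolean
    codec => json_lines           # codec, parses newline-delimited JSON
  }
}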

The Coralogix Logstash output plugin

Most of our examples will use the Coralogix Logstash output plugin. The plugin configuration has the following structure:

 

coralogix {
  config_params => {
    "PRIVATE_KEY" => "my_private_key"
    "APP_NAME" => "my_app"
    "SUB_SYSTEM" => "my_subsystem"
  }
  is_json => true
}

The config_params section, with its PRIVATE_KEY, APP_NAME, and SUB_SYSTEM parameters, is mandatory.
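
PRIVATE_KEY in particular is best kept out of the configuration file. Logstash substitutes ${VAR} references with environment variables at startup, so a sketch like the following (assuming PRIVATE_KEY is exported in the environment, as Example 1 below also assumes) avoids hardcoding the key:

coralogix {
  config_params => {
    "PRIVATE_KEY" => "${PRIVATE_KEY}"   # resolved from the environment at startup
    "APP_NAME" => "my_app"
    "SUB_SYSTEM" => "my_subsystem"
  }
  is_json => true
}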

Examples

The remainder of this document will focus on providing examples of working Logstash configurations.

Example 1

#This implementation uses the beats input plugin. It will listen on port 5044 for beats 
#traffic 
input {
   beats {
     port => 5044
   }
}
#This implementation sends logs to Coralogix via the Coralogix output plugin. In this 
#example %{[@metadata][beat]} sets the subsystem name to the value of the beat 
#metadata field (e.g. 'metricbeat')
output {
  coralogix {
    config_params => {
      "PRIVATE_KEY" => "${PRIVATE_KEY}"
      "APP_NAME" => "my_app"
      "SUB_SYSTEM" => "%{[@metadata][beat]}"
    }
    is_json => true
  }
}

Example 2

# This tcp input plugin listens on port 6000. It uses the json_lines codec to parse a stream
# of JSON documents separated by newlines
input {
  tcp {
    port => 6000
    codec => json_lines
  }
}
#Coralogix output plugin.
output {
  coralogix {
    config_params => {
      "PRIVATE_KEY" => "YOUR_PRIVATE_KEY"
      "APP_NAME" => "APP_NAME"
      "SUB_SYSTEM" => "SUB_NAME"
    }
    is_json => true
  }
}

Example 3

#This example uses the jdbc input plugin. It can ingest data from any DB with a JDBC interface.
#The plugin doesn't come with drivers, hence the jdbc_driver_class and jdbc_driver_library
#configuration options.
input {
  jdbc {
    jdbc_driver_library => "c:/logstash/mssql-jdbc-7.2.2.jre8.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://172.29.85.103:1433;databaseName=AVALogger;"
    jdbc_user => "avalogger"
    jdbc_password => "ZAQ!2wsx"
    connection_retry_attempts => 5
    statement => "SELECT * FROM AVALogger.dbo.parsed_data WHERE id > :sql_last_value"
    use_column_value => true
    tracking_column => "id"
    schedule => "* * * * *"
    last_run_metadata_path => "c:/logstash/logstash_jdbc_last_run"
  }
}
#The json filter plugin takes an existing field that contains JSON and expands it into an 
#actual data structure within the Logstash event. In this case it will take the content of 
#'extra_data', an original DB column. skip_on_invalid_json allows the filter to skip 
#non-JSON or invalid JSON field values without warnings or added logic.
filter {
   json {
    source => "extra_data"
    target => "extra_data"
    skip_on_invalid_json => true
   }
}
#Coralogix output plugin.
output {
  coralogix {
    config_params => {
      "PRIVATE_KEY" => "504033e3-1474-d733-9b9c-c9531105be40"
      "APP_NAME" => "AvaTrade"
      "SUB_SYSTEM" => "RDS"
    }
    is_json => true
  }
}

Example 4

#In this example the file input plugin is being used. It streams events from files, normally by 
#tailing them (this is configurable). All logs will get a field called 'type' added to them 
#with the value 'production-log'. 'path' indicates where to read the logs from, and the json 
#codec decodes (via inputs) and encodes (via outputs) full JSON messages.
input {
  file {
    type    => "production-log"
    path    => "/home/admin/apps/fiverr/current/log/production.log"
    codec   => "json"
  }
}
#This output sends logs to a Redis queue using the Redis output plugin. It uses an 'if' 
#statement to direct only logs with type "production-log" to the output.
output {
  if [type] == "production-log" {
    redis {
        host => "192.168.0.2"
        port => 6379
        data_type => "list"
        key => "logstash-production-log"
        codec   => "json_lines"
    }
  }
}

Example 5

#Like in example 4, the file input plugin is being used. A 'type' field is added to each of the 
#processed events. The "plain" codec is for plain text with no delimiting between events. 
#'start_position => "beginning"' means that a file will be read starting from the beginning and not 
#from the end (which is the default).
input {
 file {
   type => "jtracker"
   codec => "plain"
   path => "/app/logs/trk.log"
   start_position => "beginning"
 }
}
#The json filter plugin takes an existing field that contains JSON and expands it into an 
#actual data structure within the Logstash event. In this case it will take the content of 
#the 'message' field and structure it into the same field. skip_on_invalid_json allows the filter 
#to skip non-JSON or invalid JSON field values without warnings or added logic.
filter {
   json {
    source => "message"
    target => "message"
    skip_on_invalid_json => true
   }
}
#This configuration uses two output plugins in parallel: Elasticsearch and Coralogix. 
#The Elasticsearch target index name is built from "vm-jtracker" plus the event date, which is 
#appended to the string with the Logstash sprintf format based on @timestamp. In this example the 
#Coralogix output plugin is configured to ship the logs through a proxy. Username and password 
#are optional proxy parameters.
output {
  elasticsearch {    
    hosts => ["172.27.247.108:9200"]
    index => "vm-jtracker-%{+YYYY.MM.dd}"
  }
  coralogix_logger { 
    config_params => {
        "PRIVATE_KEY" => "902115ae-c512-edaf-d015-743d0da29cb6"
        "APP_NAME" => "Production"
        "SUB_SYSTEM" => "jtracker"
    }
    proxy => {
            host => "48.56.67.73"
            port => "1982"
    } 

    timestamp_key_name => "@timestamp"     
    log_key_name => "message"     
    is_json => true   
   } 
}

Example 6

#Like previous examples, the file input plugin is being used, this time with the 'exclude' 
#parameter, which indicates which files to ignore as input. The multiline codec collapses 
#multiline messages and merges them into a single event. In this example it will start a 
#new event every time it recognizes a string of word characters that ends with 4 digits, followed 
#by what looks like a timestamp in the form hh:mm:ss.ffffff. This is the regex associated 
#with 'pattern'. negate => true means that a message not matching the pattern constitutes 
#a match for the multiline codec, and the 'what' parameter indicates its relation to the 
#multi-line event (here it is merged with the previous lines).
input {
  file {
    path => "/mnt/data/logs/pipeline/*.1"
    exclude => "*.gz"
    codec => multiline {
      pattern => "\w+\d{4} \d{2}\:\d{2}\:\d{2}\.\d{6}"
      negate => true
      what => previous
    }
  }
}
#The grok filter plugin parses arbitrary text and structures it. In this example it will parse 
#the event message field into additional log fields designated by the regex named groups. 
#It will put the rest of the log into a field named 'log' and will then remove the 
#original message field.
filter {
   grok {
    match => { "message" => "(?<loglevel>[A-Z]{1})(?<time>%{MONTHNUM}%{MONTHDAY} %{TIME}) %{POSINT:process}-%{POSINT:thread} %{DATA:function}:%{POSINT:line}] %{GREEDYDATA:log}" }
    remove_field => [ "message" ]
   }
# Next in line to process the event is the json filter plugin, as in Example 3
   json {
    source => "log"
    target => "log"
    skip_on_invalid_json => true
   }
#The mutate filter plugin is used to rename, remove, replace, and modify fields in your events. 
#The order of mutations is preserved by using separate mutate blocks.
   mutate {
#This section creates a parent field called message for all these different fields.
    rename => {
      "loglevel" => "[message][loglevel]"
      "process" => "[message][process]"
      "thread" => "[message][thread]"
      "function" => "[message][function]"
      "line" => "[message][line]"
      "log" => "[message][log]"
      "message" => "[message][message]"
     }
#Copies source => destination
     copy => {
      "time" => "[message][message][time]"
      "path" => "[message][log_path]"
     }
   }
  mutate {
#Converts the field type
    convert => {
      "[message][message][process]" => "integer"
      "[message][message][thread]" => "integer"
      "[message][message][line]" => "integer"
    }
  }
#The truncate filter plugin allows you to truncate fields longer than a given length.
  truncate {
    fields => [ "time" ]
    length_bytes => 17
  }
#The date filter plugin is used for parsing dates from fields, and then using that date or 
#timestamp as the Logstash timestamp for the event. In this case there is only one 
#format to look for in the field 'time'. It will update the default @timestamp field for the 
#event and then remove the field 'time' from the event.
  date {
    match => [ "time", "MMdd HH:mm:ss.SSS" ]
    remove_field => [ "time" ]
  }
}
output {
  coralogix {
    config_params => {
      "PRIVATE_KEY" => "a298b4f5-7d13-4e9a-5b52-9da5ca45ac05"
      "APP_NAME" => "pipeline"
      "SUB_SYSTEM" => "aus"
    }
    log_key_name => "message"
    timestamp_key_name => "@timestamp"
    is_json => true
  }
}

Example 7

#Like previous examples, the file input plugin is being used. sincedb_path points to the 
#file that holds the current position of the monitored log files. Read mode means that the files 
#will be treated as if they are content complete; Logstash will look for EOF and then emit the 
#accumulated characters as a line. This helps with processing gzipped files. discover_interval 
#sets how often the plugin uses the path pattern to look for new files. stat_interval 
#sets how often we check whether files are modified. file_completed_action => log together with 
#file_completed_log_path will append the path of each fully read file to the file specified in 
#file_completed_log_path. file_chunk_size sets the block size to be read from the file; in this 
#configuration we specified 4x the default 32KB chunk. The json codec decodes (via inputs) 
#and encodes (via outputs) full JSON messages.
input {
  file {
    path => [
      "/var/log/genie/*.gz",
      "/var/log/genie/*.gzip"
    ]
    sincedb_path => "/dev/null"
    mode => read
    codec => "json"
    discover_interval => 10
    stat_interval => "500ms"
    file_completed_action => log
    file_completed_log_path => "/dev/null"
    file_chunk_size => 131072
  }
}
output {
  coralogix {
    config_params => {
      "PRIVATE_KEY" => ""
      "APP_NAME" => "Cobalo"
      "SUB_SYSTEM" => "test"
    }
    is_json => true
  }
}
