Syslog 101: Everything You Need to Know to Get Started

Syslog takes its name from the System Logging Protocol. It is a standard for message logging that has been in use for decades to send system logs or event messages to a specific server, called a Syslog Server.

Syslog Components

To offer a central repository for logs from multiple sources, Syslog servers have several components, including:

  • Syslog Listener: This gathers and processes Syslog data sent over UDP port 514 (a quick sketch of the idea follows this list).
  • Database: Syslog servers need databases to store the massive amounts of data for quick access.
  • Management and Filtering Software: The Syslog Server needs software to automate the work and to filter the logs so that specific messages can be viewed. This software can extract specific parameters and filter logs as needed.
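To get a feel for what the listener does, you can stand in for one with a quick test. This is only a sketch: it assumes a Linux host with the OpenBSD variant of netcat and the util-linux logger tool, and that nothing else is already bound to UDP port 514.

# Terminal 1: listen for raw Syslog datagrams on UDP port 514 (root is needed for ports below 1024)
sudo nc -lu 514

# Terminal 2: send a test message to that port over UDP
logger --server 127.0.0.1 --port 514 --udp "hello from logger"

The raw frame that arrives in the first terminal is exactly the kind of data a real Syslog listener parses, stores and indexes.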

Message Components 

The Syslog message format is divided into three parts:

  • PRI: A calculated Priority Value that encodes the message's Facility Code and Severity Level.
  • HEADER: Consists of two identifying fields which are the Timestamp and the Hostname (the machine name that sends the log).
  • MSG: This contains the actual message about the event that happened. It is UTF-8 encoded and is also divided into a TAG and a CONTENT field. The information includes event messages, severity, host IP addresses, diagnostics and more. 
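Putting the three parts together, a classic RFC 3164-style message looks like this (the sample is taken from the RFC itself):

<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8

Here <34> is the PRI, "Oct 11 22:14:15 mymachine" is the HEADER (Timestamp and Hostname), and the rest is the MSG, with "su" as the TAG and the remainder as the CONTENT.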

More About PRI 

This is derived from two numeric values that help categorize the message: the Facility Code and the Severity Level.

Facility Code: This value specifies the type of program that is logging the message. Codes 0 through 15 are predefined by the standard, while codes 16 through 23 (local0 – local7) are reserved for locally defined use. Messages with different facilities may be handled differently. The list of facilities defined by the standard is:

Facility Code | Keyword | Description
0 | kern | Kernel messages
1 | user | User-level messages
2 | mail | Mail system
3 | daemon | System daemons
4 | auth | Security/authentication messages
5 | syslog | Messages generated internally by syslogd
6 | lpr | Line printer subsystem
7 | news | Network news subsystem
8 | uucp | UUCP subsystem
9 | cron | Clock daemon
10 | authpriv | Security/authentication messages
11 | ftp | FTP daemon
12 | ntp | NTP subsystem
13 | security | Log audit
14 | console | Log alert
15 | solaris-cron | Scheduling daemon
16-23 | local0 – local7 | Locally used facilities

The mapping between facility code and keyword is not uniform in different operating systems and Syslog implementations.

Severity Level: The second value of a Syslog message categorizes the importance or severity of the message in a numerical code from 0 to 7.

Level | Severity | Description
0 | Emergency | System is unusable
1 | Alert | Action must be taken immediately
2 | Critical | Critical conditions
3 | Error | Error conditions
4 | Warning | Warning conditions
5 | Notice | Normal but significant condition
6 | Informational | Informational messages
7 | Debug | Debug-level messages

The PRI value is calculated by taking the Facility Code, multiplying it by eight and then adding the Severity Level. Messages are typically no longer than 1024 bytes.
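For example, a Critical (severity 2) message from the auth facility (code 4) has a PRI of (4 × 8) + 2 = 34, which matches the <34> in the sample message shown earlier. A quick shell check of the arithmetic:

echo $(( (4 * 8) + 2 ))    # prints 34: facility auth (4), severity Critical (2)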

Advantages

Syslog allows the separation of the software that generates messages, the system that stores them and the software that reports and analyzes them. Therefore it provides a way to ensure that critical events are logged and stored off the original server. An attacker’s first effort after compromising a system is usually to cover their tracks left in the logs. Logs forwarded via Syslog are out of reach.

Monitoring numerous logs from numerous systems is time consuming and impractical. Syslog helps solve this issue by forwarding those events to the centralized Syslog server, consolidating logs from multiple sources into a single location. 

While Syslog is not the best way to monitor the status of networked devices, it can be a good way to monitor the overall health of network equipment. A sudden spike in event volume, for example, might indicate a traffic surge or a misbehaving device. Spotting this at the edge of your system lets you get ahead of the problem.

Syslog can be configured to forward authentication events to a Syslog server, without the overhead of having to install and configure a full monitoring agent. 
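With rsyslog, for instance, this can be a single configuration line along the following lines. The server name and port are placeholders for your own Syslog server; @@ selects TCP, while a single @ would select UDP:

auth,authpriv.*    @@logserver.example.com:514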

Limitations

Syslog does not include an authentication mechanism and is therefore weak on security. It is possible for one machine to impersonate another and send fake log events, and the protocol is also susceptible to replay attacks.

It is also possible to lose Syslog messages because of the protocol's reliance on UDP transport. UDP is connectionless and offers no delivery guarantee, so messages can be lost to network congestion or packet loss.

Another limitation of the Syslog protocol is that the device being monitored must be up, running and connected to the network in order to generate and send a message. A server suffering a critical failure may never send an error at all if the system goes offline first. Therefore, Syslog is not a good way to monitor the up and down status of devices.

Finally, although there are standards about the components of a message, there is a lack of consistency in terms of how message content is formatted. The protocol does not define standard message formatting. Some messages may be human readable, some not. Syslog just provides a method to transport the message.

Log Messages Best Practices

To help create the most useful Syslog messages possible, follow these best practices:

Use Parsable Log Formats

There is no universal structure for log messages. Working with large volumes of logs is almost impossible if you don't have a way to automatically parse log entries to find what you're searching for, and tools are far more likely to work with a parsable format.

One example is JSON, a structured log format that has become a de facto standard for many logging applications. It is both machine- and human-readable and is supported by most languages and runtimes. It also has the added benefit of being compact and efficient to parse.
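For instance, an application could hand a structured JSON entry to Syslog with something like the following; the tag and field names are purely illustrative:

logger -t billing-api '{"level":"error","message":"payment declined","order_id":"A-1042","duration_ms":312}'

Any log analysis tool that understands JSON can then extract order_id or duration_ms without custom parsing rules.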

Use a Logging Library or Framework

There are many logging libraries for programming languages and runtime environments. Whatever the language your app is developed with, use a compatible framework to transmit logs from your app or service to a Syslog server.

Standardized Formats

Set out, in your operating standards, the format or schema of the messages for all users to follow. Standardizing the formats means less clutter in the logs and makes them more searchable. Avoid long sentences and use standard abbreviations, e.g. 'ms' for 'milliseconds'.

There should be non-negotiable fields in your logs. IP address, timestamp, whatever you need. It’s important to have basic fields that are always set, every time. Additionally, log formats without schemas are difficult to maintain as new logging code is added to your software, new team members join and new features are developed.

Knowing exactly what information needs to be embedded in log messages helps users write them and helps everyone else read them.

Include Identifiers 

Closely linked to defining a precise log schema is the practice of including identifiers in your messages. Identifiers show where a message came from and how multiple messages are related. For example, including a transaction or session ID in your log message allows you to link two separate errors to the same user session.
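As a simple sketch, with the identifier names invented for illustration:

logger -t checkout "session_id=4f2a9c transaction_id=TX-1042 payment declined: card expired"

Searching the central log store for session_id=4f2a9c then surfaces every message from that user's session, wherever it was generated.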

Include Syslog Severity Levels

Correctly using the most appropriate Severity Level when sending a message can make future troubleshooting easier. Logging at the wrong level can cause monitoring problems, creating false alarms or masking urgent issues.
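With the logger utility, for example, the -p flag sets the facility and Severity Level explicitly (the tag and message text here are illustrative):

logger -p auth.err -t login-monitor "failed login for user admin from 10.0.0.5"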

Include the Right Amount of Context

The best Syslog messages include all the relevant context to recreate the state of your application at the time of the logging call. This means adding the source of the problem in error messages and concise reasons for sending emergency log messages.

Avoid Multi-line Log Messages

The Syslog protocol specification allows multiple lines to be contained within a single log message, but this can cause parsing issues. Line breaks in log lines aren't friendly to every log analysis tool; sed and grep, for example, don't handle searching for patterns across lines very well. Therefore, review and declutter the messages, following the agreed message format.
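A simple way to keep a message on one line, sketched here with a placeholder command name, is to flatten any newlines before handing the output to Syslog:

nightly_backup 2>&1 | tr '\n' ' ' | logger -t backup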

However, if you absolutely must include multiline messages then investigate using a cloud-based log aggregation tool such as Papertrail. This has the ability to find the separate parts of a single log message when it’s split across lines.

Don’t Log Sensitive Data

Never ever write any passwords to the log files. The same applies for sensitive data like credit card details, bank account details and personal information. Syslog messages are rarely encrypted at rest. A malicious attacker will be able to easily read them. 

Refine Your Logging Code

Another good practice is to review the logging code to:

  • Add more context to the Emergency, Alert, Critical, Error and Warning log statements.
  • Keep the Notice, Informational and Debug messages short.
  • Log at decision points; don't log inside tight loops.

Common Tooling

Some of the best Syslog tools for Linux and Windows include:

SolarWinds Kiwi Syslog Server 

One of the best tools for collecting, viewing and archiving Syslog messages. It is a versatile, user-friendly viewer with automated message responses. This tool is easy to install and generates reports in plain text or HTML.

The software handles Syslog and SNMP from Windows, Linux and UNIX hosts.

Logstash

Data from the centralized Syslog server can be forwarded to Logstash. This can perform further parsing and enrichment of the log data before sending it on to Elasticsearch. Here’s a guide with hands-on exercises for getting familiar with Syslog in Logstash.

LOGalyzer

LOGalyzer is another free open-source, centralized log management and network monitoring tool.

It supports Linux and Unix servers, network devices and Windows hosts, providing real-time event detection and extensive search capabilities.

Summary

Complete network monitoring requires using multiple tools. Syslog is an important one because it ensures that events that occur without a dramatic, immediate effect do not fall through any monitoring gaps. The best practice is to use software that combines all the tools, so that you always have an overview of what is happening in your network.

As Syslog is a standard protocol, many applications support sending data to Syslog. By centralizing this data, you can easily audit security, monitor application behavior and keep track of other important server information.

The Syslog log message format is supported by most programming tools and runtime environments so it’s a useful way to transmit and record log messages. Creating log messages with the right data requires users to think about the situations and to tailor the messages appropriately. Following best practices makes the job easier.

A Practical Guide to Logstash: Syslog Deep Dive

Syslog is a popular standard for centralizing and formatting log data generated by network devices. It provides a standardized way of generating and collecting log information, such as program errors, notices, warnings, status messages, and so on. Almost all Unix-like operating systems, such as those based on Linux or BSD kernels, use a Syslog daemon that is responsible for collecting log information and storing it. 

Logs are usually stored locally, but they can also be streamed to a central server if the administrator wants to be able to access all logs from a single location. By default, Syslog messages are transmitted over UDP port 514.

Note: It’s recommended to avoid UDP whenever possible, as it doesn’t guarantee that all logs will be sent and received; when the network is unreliable or congested, some messages could get lost in transit.

For more security and reliability, port 6514 is often used with TCP connections and TLS encryption.

In this post, we'll learn how to collect Syslog messages from our servers and devices with Logstash and send them to Elasticsearch. This lets us take advantage of Elasticsearch's ability to ingest large volumes of data and then quickly and efficiently search it for exactly what we need.

We’ll explore two methods. One involves using the Syslog daemon to send logs through a TCP connection to a central server running Logstash. The other method uses Logstash to monitor log files on each server/device and automatically index messages to Elasticsearch.

Getting Started

Let's take a look at what typical syslog events look like. These are usually collected locally in a file named /var/log/syslog.

To display the first 10 lines, we’ll type:

sudo head -10 /var/log/syslog


Let’s analyze how a syslog line is structured.

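A representative line looks something like this (taken from a typical Ubuntu-style system; your own entries will differ):

Jul 12 08:24:14 coralogix systemd[1]: Started Daily apt upgrade and clean activities.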

We can see the line starts with a timestamp, including the month name, day of month, hour, minute and second at which the event was recorded. The next entry is the hostname of the device generating the log. Next is the name of the process that created the log entry, its process ID number, and, finally, the log message itself.

Logs are very useful when we want to monitor the health of our systems or debug errors. But when we have to deal with tens, hundreds, or even thousands of such systems, it’s obviously too complicated to log into each machine and manually look at syslogs. By centralizing all of them into Elasticsearch, it makes it easier to get a birds-eye view over all of the logged events, filter only what we need and quickly spot when a system is misbehaving.

Collecting syslog Data with Logstash

In this post, we'll explore two methods with which we can get our syslog data into Logstash and, ultimately, into an Elasticsearch index:

  1. Using the syslog service itself to forward logs to Logstash, via TCP connections.
  2. Configuring Logstash to monitor log files and collect their contents as soon as they appear within those files.

Forwarding Syslog Messages to Logstash via TCP Connections

The syslog daemon has the ability to send all the log events it captures to another device, through a TCP connection. Logstash, on the other hand, has the ability to open up a TCP port and listen for incoming connections, looking for syslog data. Sounds like a perfect match! Let’s see how to make them work together.

For simplicity, we will use the same virtual machine both to send the logs and to collect them. In a real-world scenario, we would configure a separate server with Logstash to listen for incoming connections on a TCP port. Then, we would configure the syslog daemons on all of the other servers to send their logs to that Logstash instance.

Important: In this exercise, we're configuring the syslog daemon first and Logstash last, since we want the first captured log events to be the ones we intentionally generate. In a real scenario, configure Logstash to listen on the TCP port first. This ensures that when you later configure the syslog daemons to send their messages, Logstash is ready to ingest them. If Logstash isn't ready, the log entries sent while you configure it won't make it into Elasticsearch.

We will forward our syslogs to TCP port 10514 of the virtual machine. Logstash will listen to port 10514 and collect all messages.

Let’s edit the configuration file of the syslog daemon.

sudo nano /etc/rsyslog.d/50-default.conf

Above the line “#First some standard log files. Log by facility” we’ll add the following:

*.*                         @@127.0.0.1:10514


*.* indicates that all messages should be forwarded. @@ instructs the rsyslog utility to transmit the data over a TCP connection (a single @ would mean UDP).

To save the config file, we press CTRL+X, after which we type Y and finally press ENTER.

We’ll need to restart the syslog daemon (called “rsyslogd”) so that it picks up on our desired changes.

sudo systemctl restart rsyslog.service

If you don’t have a git tool available on your test system, you can install it with:

sudo apt update && sudo apt install git

Now let’s clone the repo which contains the configuration files we’ll use with Logstash.

sudo git clone https://github.com/coralogix-resources/logstash-syslog.git /etc/logstash/conf.d/logstash-syslog

Let’s take a look at the log entries generated by the “systemd” processes.

sudo grep "systemd" /var/log/syslog

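The output will include lines similar to this one (the hostname and messages will differ on your system):

Jul 12 08:16:01 coralogix systemd[1]: Started Session 3 of user student.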

We’ll copy one of these lines and paste it to the https://grokdebug.herokuapp.com/ website, in the first field, the input section.


Now, in a new web browser tab, let’s take a look at the following Logstash configuration: https://raw.githubusercontent.com/coralogix-resources/logstash-syslog/master/syslog-tcp-forward.conf.


We can see in the highlighted “input” section how we instruct Logstash to listen for incoming connections on TCP port 10514 and look for syslog data.
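In simplified form, that input section looks roughly like this. This is a sketch rather than the verbatim file, so refer to the linked configuration for the exact contents:

input {
  # listen for syslog data forwarded over TCP
  tcp {
    port => 10514
    type => "syslog"
  }
}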

To test how the Grok pattern we use in this config file matches our syslog lines, let’s copy it

%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}

and then paste it to the https://grokdebug.herokuapp.com/ website, in the second field, the pattern section.


If the line and the pattern match, the debugger output shows every field perfectly extracted.

Now, let’s run Logstash with this configuration file.

sudo /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-syslog/syslog-tcp-forward.conf

Since logs are continuously generated and collected, we won't stop Logstash with CTRL+C this time. We'll just leave it running until it reports that it has started successfully.


Specifically, we’re looking for the “Successfully started Logstash” message.

Let’s leave Logstash running in the background, collecting data. Leave its terminal window open (so you can see it catching syslog events) and open up a second terminal window to enter the next commands.

It’s very likely that at this point no syslog events have been collected yet, since we just started Logstash. Let’s make sure to generate some log entries first. A simple command such as

sudo ls

will ensure we’ll generate a few log messages. We’ll be able to see in the window where Logstash is running that sudo generated some log entries and these have been added to the Elasticsearch index.

Let’s take a look at an indexed log entry.

curl -XGET "https://localhost:9200/syslog-received-on-tcp/_search?pretty" -H 'Content-Type: application/json' -d'{"size": 1}'

The output we’ll get will contain something similar to this:

        {
        "_index" : "syslog-received-on-tcp",
        "_type" : "_doc",
        "_id" : "fWJ7QXMB9gZX17ukIc6D",
        "_score" : 1.0,
        "_source" : {
          "received_at" : "2020-07-12T05:24:14.990Z",
          "syslog_message" : " student : TTY=pts/1 ; PWD=/home/student ; USER=root ; COMMAND=/bin/ls",
          "syslog_timestamp" : "2020-07-12T05:24:14.000Z",
          "message" : "<85>Jul 12 08:24:14 coralogix sudo:  student : TTY=pts/1 ; PWD=/home/student ; USER=root ; COMMAND=/bin/ls",
          "syslog_hostname" : "coralogix",
          "port" : 51432,
          "type" : "syslog",
          "@timestamp" : "2020-07-12T05:24:14.990Z",
          "host" : "localhost",
          "@version" : "1",
          "received_from" : "localhost",
          "syslog_program" : "sudo"
        }

Awesome! Everything worked perfectly. Now let’s test out the other scenario.

Monitoring syslog Files with Logstash

We’ll first need to stop the Logstash process we launched in the previous section. Switch to the terminal where it is running and press CTRL+C to stop it.

Let’s open up this link in a browser and take a look at the Logstash config we’ll use this time: https://raw.githubusercontent.com/coralogix-resources/logstash-syslog/master/logstash-monitoring-syslog.conf.


We can see that the important part here is that we tell it to monitor the “/var/log/syslog” file.
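Stripped down to its essence, the relevant input section looks roughly like this (a sketch, not the verbatim file; see the linked configuration for the exact contents):

input {
  # read entries from the local syslog file
  file {
    path => "/var/log/syslog"
    type => "syslog"
  }
}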

Let’s run Logstash with this config.

sudo /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-syslog/logstash-monitoring-syslog.conf

We'll wait until it finishes ingesting the existing file contents and then press CTRL+C to exit the process.

Let’s see the data that has been parsed.

curl -XGET "https://localhost:9200/syslog-monitor/_search?pretty" -H 'Content-Type: application/json' -d'{"size": 1}'

We will get an output similar to this:

        {
        "_index" : "syslog-monitor",
        "_type" : "_doc",
        "_id" : "kmKYQXMB9gZX17ukC878",
        "_score" : 1.0,
        "_source" : {
          "type" : "syslog",
          "@version" : "1",
          "syslog_message" : " [origin software=\"rsyslogd\" swVersion=\"8.32.0\" x-pid=\"448\" x-info=\"https://www.rsyslog.com\"] rsyslogd was HUPed",
          "syslog_hostname" : "coralogix",
          "message" : "Jul 12 05:52:46 coralogix rsyslogd:  [origin software=\"rsyslogd\" swVersion=\"8.32.0\" x-pid=\"448\" x-info=\"https://www.rsyslog.com\"] rsyslogd was HUPed",
          "received_at" : "2020-07-12T05:55:49.644Z",
          "received_from" : "coralogix",
          "host" : "coralogix",
          "syslog_program" : "rsyslogd",
          "syslog_timestamp" : "2020-07-12T02:52:46.000Z",
          "path" : "/var/log/syslog",
          "@timestamp" : "2020-07-12T05:55:49.644Z"
        }

Clean-Up Steps

To clean up what we created in this exercise, we just need to delete the two new indexes that we added

curl -XDELETE "https://localhost:9200/syslog-received-on-tcp/"

curl -XDELETE "https://localhost:9200/syslog-monitor/"

and also delete the directory where we placed our Logstash config files.

sudo rm -r /etc/logstash/conf.d/logstash-syslog

Conclusion

As you can see, it's fairly easy to gather all of your logs in a single location, and the advantages are invaluable. For example, besides making everything more accessible and easier to search, think about servers failing. It happens a little more often than we'd like. If logs are kept on the server, once it fails, you lose the logs. In another common scenario, hackers delete the logs once they compromise a machine. By collecting everything into Elasticsearch, you'll still have the original logs, untouched and ready to review, so you can see what happened before the machine experienced problems.