[Live Webinar] Next-Level O11y: Why Every DevOps Team Needs a RUM Strategy Register today!

Instantly Parse The Top 12 Log Types with Coralogix

  • Mary Mats
  • August 13, 2019

Over the past few months, I have had the opportunity to work with and serve hundreds of Coralogix’s customers. The challenges in performing efficient log analytics are numerous: log monitoring, collecting, searching, visualizing, and alerting. What I have come to learn is that at the heart of every one of these challenges lies the challenge of data parsing. JSON-structured logs are easier to read, search, alert on, and visualize. They can be queried using the ES APIs, exported to Excel sheets, and even displayed in Grafana. So why are so many logs still plain text by default rather than structured?

Because our focus at Coralogix has always been on our customers and their needs, we developed a parsing engine that lets you parse, extract, mask, and even exclude log entries from a single UI, in-app or via API. To get you started with log parsing, we created predefined parsing rules for the 12 most common logs on the web.

In this post, we collected the following log templates and created a named-group regex for each one in order to parse them into JSON-structured logs in Coralogix: Apache logs, IIS logs, MongoDB logs, ELB logs, ALB logs, CloudFront logs, MySQL logs, access logs, Nginx logs, HTTP headers, the user agent field, and Java stack traces.

Note that every regex is offered as a recommendation. Logs can of course have different configurations and permutations, and you can easily adjust the parsing rules below to your needs. More on named group regex here.
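To make the idea of a named-group regex concrete outside of Coralogix, here is a minimal sketch in Python, whose `re` module uses the same `(?P<name>...)` syntax. The pattern and sample line are the ones from the Basic HTTP Headers section of this post; the function name `parse_line` is hypothetical and only illustrates how named groups become JSON keys.

```python
import json
import re

# Pattern and sample taken from the "Basic HTTP Headers" section of this post.
PATTERN = re.compile(r'(?P<method>[A-Z]+) (?P<path>[^ ]+) (?P<protocol>[A-Z0-9./]+)')

def parse_line(line):
    """Return a dict keyed by group name, or None when the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None

sample = "GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1"
fields = parse_line(sample)

# Each named group becomes one key of the resulting JSON object.
print(json.dumps(fields, indent=2))
```

A line that does not fit the pattern yields `None`, which is roughly what "the rule did not match" means for a plain-text log.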

1. User Agent (Use an “Extract” rule in Coralogix):

https://regex101.com/r/pw0YeT/3

Sample Log

Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405

Regular Expression

(?P<mozillaVersion>Mozilla/[0-9.]+) \((?P<sysInfo>[^)]+)\)(?: (?P<platform>[^ ]+))?(?: \((?P<platformInfo>[^)]+)\))?(?: (?P<extentions>[^\n]+))?

Results

{ 
  "extentions" : "Mobile/7B405" ,
  "platformInfo" : "KHTML, like Gecko" ,
  "sysInfo" : "iPad; U; CPU OS 3_2_1 like Mac OS X; en-us" ,
  "mozillaVersion" : "Mozilla/5.0" ,
  "platform" : "AppleWebKit/531.21.10"
}
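The last three groups of this expression are optional, so shorter user agent strings still match. The sketch below, written in Python as an illustration, runs the expression (with its backslash escapes written out) against the sample above and against a shorter string. The shorter string is a hypothetical input, not taken from a real log.

```python
import re

UA = re.compile(
    r'(?P<mozillaVersion>Mozilla/[0-9.]+) \((?P<sysInfo>[^)]+)\)'
    r'(?: (?P<platform>[^ ]+))?(?: \((?P<platformInfo>[^)]+)\))?'
    r'(?: (?P<extentions>[^\n]+))?'
)

full = ("Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) "
        "AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405")
# Hypothetical shorter user agent with no platform or extension part.
short = "Mozilla/4.0 (compatible; MSIE 5.0b1; Mac_PowerPC)"

full_fields = UA.search(full).groupdict()

# Optional groups that did not take part in the match come back as None,
# so they are dropped here before building the final record.
short_fields = {k: v for k, v in UA.search(short).groupdict().items() if v is not None}

print(full_fields)
print(short_fields)
```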

2. CloudFront (Use a “Parse” rule in Coralogix):

https://regex101.com/r/q2DmKi/4

Sample Log

2014-05-23 01:13:11 FRA2 182 192.0.2.10 GET d111111abcdef8.cloudfront.net /view/my/file.html 200 www.displaymyfiles.com Mozilla/4.0%20(compatible;%20MSIE%205.0b1;%20Mac_PowerPC) - zip=98101 RefreshHit MRVMF7KydIvxMWfJIglgwHQwZsbG2IhRJ07sn9AkKUFSHS9EXAMPLE== d111111abcdef8.cloudfront.net http - 0.001 - - - RefreshHit HTTP/1.1 Processed 1

Regular Expression

(?P<date_time>[0-9]{4}-[0-9]{2}-[0-9]{2}\s*[0-9]{2}:[0-9]{2}:[0-9]{2}) (?P<x_edge_location>[^ ]+) (?P<sc_bytes>[0-9]+) (?P<c_ip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) (?P<cs_method>[^ ]+) (?P<cs_host>[^ ]+) (?P<cs_uri_stem>[^ ]+) (?P<sc_status>[0-9]+) (?P<cs_referer>[^ ]+) (?P<cs_user_agent>[^ ]+) (?P<cs_uri_query>[^ ]+) (?P<cs_cookie>[^ ]+) (?P<x_edge_result_type>[^ ]+) (?P<x_edge_request_id>[^ ]+) (?P<x_host_header>[^ ]+) (?P<cs_protocol>[^ ]+) (?P<cs_bytes>[^ ]+) (?P<time_taken>[^ ]+) (?P<x_forwarded_for>[^ ]+) (?P<ssl_protocol>[^ ]+) (?P<ssl_cipher>[^ ]+) (?P<x_edge_response_result_type>[^ ]+) (?P<cs_protocol_version>[^ ]+) (?P<fle_status>[^ ]+) (?P<fle_encrypted_fields>[^\n]+)

Results

{ 
  "x_edge_location" : "FRA2" , 
  "cs_method" : "GET" , 
  "x_edge_result_type" : "RefreshHit" , 
  "ssl_cipher" : "-" ,
  "cs_uri_stem" : "/view/my/file.html" , 
  "cs_uri_query" : "-" ,
  "x_edge_request_id" : "MRVMF7KydIvxMWfJIglgwHQwZsbG2IhRJ07sn9AkKUFSHS9EXAMPLE==" , 
  "sc_status" : "200" , 
  "date_time" : "2014-05-23 01:13:11" ,
  "sc_bytes" : "182" , 
  "cs_protocol_version" : "HTTP/1.1" ,
  "cs_protocol" : "http" , 
  "cs_cookie" : "zip=98101" , 
  "ssl_protocol" : "-" ,
  "fle_status" : "Processed" ,
  "cs_user_agent" : "Mozilla/4.0%20(compatible;%20MSIE%205.0b1;%20Mac_PowerPC)" ,
  "cs_host" : "d111111abcdef8.cloudfront.net" ,
  "cs_bytes" : "-" ,
  "x_edge_response_result_type" : "RefreshHit" ,
  "fle_encrypted_fields" : "1" ,
  "c_ip" : "192.0.2.10" ,
  "time_taken" : "0.001" ,
  "x_forwarded_for" : "-" ,
  "x_host_header" : "d111111abcdef8.cloudfront.net" ,
  "cs_referer" : "www.displaymyfiles.com"
 }
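Every value captured by a regex is a string, `-` stands for a missing value, and the user agent is still percent-encoded. If you post-process such a record yourself, a sketch along the following lines may help. It is written in Python and uses a handful of fields copied from the results above; the `normalize` helper and the choice of which fields are numeric are assumptions made for illustration, not part of Coralogix.

```python
from urllib.parse import unquote

# A few fields copied from the Results above; every value arrives as a string.
parsed = {
    "sc_bytes": "182",
    "sc_status": "200",
    "time_taken": "0.001",
    "cs_uri_query": "-",
    "cs_user_agent": "Mozilla/4.0%20(compatible;%20MSIE%205.0b1;%20Mac_PowerPC)",
}

# Assumption for this sketch: which fields should become numbers.
INT_FIELDS = {"sc_bytes", "sc_status"}
FLOAT_FIELDS = {"time_taken"}

def normalize(fields):
    """Turn '-' into None, cast numeric fields, and decode %XX escapes."""
    out = {}
    for key, value in fields.items():
        if value == "-":
            out[key] = None
        elif key in INT_FIELDS:
            out[key] = int(value)
        elif key in FLOAT_FIELDS:
            out[key] = float(value)
        else:
            out[key] = unquote(value)
    return out

clean = normalize(parsed)
print(clean)
```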

 

3. ELB (Elastic Load Balancer) – (Use a “Parse” rule in Coralogix):

https://regex101.com/r/T52klJ/1

Sample Log

2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.000086 0.001048 0.001337 200 200 0 57 "GET https://www.example.com:443/ HTTP/1.1" "curl/7.38.0" DHE-RSA-AES128-SHA TLSv1.2

Regular Expression

(?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9A-Z]+) (?P<elbName>[0-9a-zA-Z-]+) (?P<clientPort>[0-9.:]+) (?P<backendPort>[0-9.:-]+) (?P<request_processing_time>[.0-9-]+) (?P<backend_processing_time>[.0-9-]+) (?P<response_processing_time>[.0-9-]+) (?P<elb_status_code>[0-9-]+) (?P<backend_status_code>[0-9-]+) (?P<received_bytes>[0-9-]+) (?P<sent_bytes>[0-9-]+) "(?P<request>[^"]+)" "(?P<user_agent>[^"]+)" (?P<ssl_cipher>[^ ]+) (?P<ssl_protocol>[^ \n]+)

Results

{ 
  "timestamp" : "2015-05-13T23:39:43.945958Z" ,
  "elbName" : "my-loadbalancer" ,
  "clientPort" : "192.168.131.39:2817" ,
  "backendPort" : "10.0.0.1:80" ,
  "request_processing_time" : "0.000086" ,
  "backend_processing_time" : "0.001048" ,
  "response_processing_time" : "0.001337" ,
  "elb_status_code" : "200" ,
  "backend_status_code" : "200" ,
  "received_bytes" : "0" ,
  "sent_bytes" : "57" ,
  "request" : "GET https://www.example.com:443/ HTTP/1.1" ,
  "user_agent" : "curl/7.38.0" ,
  "ssl_cipher" : "DHE-RSA-AES128-SHA" ,
  "ssl_protocol" : "TLSv1.2"
}
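Because this format is purely positional, one missing group shifts every later value under the wrong name. A quick way to sanity-check a positional regex is to tokenize the sample independently and count the fields. The Python sketch below does that with `shlex.split`, which keeps each double-quoted section as a single token. The field order follows the Classic Load Balancer access log layout; the underscore spellings `client_port` and `backend_port` are naming choices made for this sketch.

```python
import shlex

# Field order of a Classic Load Balancer access log entry.
FIELDS = [
    "timestamp", "elb", "client_port", "backend_port",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
    "ssl_cipher", "ssl_protocol",
]

sample = (
    '2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 '
    '10.0.0.1:80 0.000086 0.001048 0.001337 200 200 0 57 '
    '"GET https://www.example.com:443/ HTTP/1.1" "curl/7.38.0" '
    'DHE-RSA-AES128-SHA TLSv1.2'
)

tokens = shlex.split(sample)        # quoted sections stay together as one token
record = dict(zip(FIELDS, tokens))  # pair each token with its positional name

print(len(tokens))
print(record["elb_status_code"], record["request"])
```

If the number of tokens differs from the number of named groups in your regex, some group is missing or merged.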

4. MongoDB (Use a “Parse” rule in Coralogix):

https://regex101.com/r/pBM9DO/1

Sample Log

2014-11-03T18:28:32.450-0500 I NETWORK [initandlisten] waiting for connections on port 27017

Regular Expression

(?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}-[0-9]{4}) (?P<severity>[A-Z]) [A-Z]+ *\[[a-zA-Z0-9]+\] (?P<message>[^\n]+)

Results

{ 
  "severity" : "I" , 
  "message" : "waiting for connections on port 27017" ,
  "timestamp" : "2014-11-03T18:28:32.450-0500" 
}
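Once the fields are extracted, the timestamp and severity are still raw strings. As a small illustration in Python, the sketch below converts the timestamp from the results above into a timezone-aware value and expands the one-letter severity code. The `SEVERITY` table lists MongoDB's documented severity letters; the variable names are hypothetical.

```python
from datetime import datetime, timezone

# MongoDB's one-letter severity codes.
SEVERITY = {"F": "Fatal", "E": "Error", "W": "Warning", "I": "Informational", "D": "Debug"}

# Fields copied from the Results above.
fields = {
    "severity": "I",
    "message": "waiting for connections on port 27017",
    "timestamp": "2014-11-03T18:28:32.450-0500",
}

# %f accepts the three-digit milliseconds and %z the -0500 offset.
when = datetime.strptime(fields["timestamp"], "%Y-%m-%dT%H:%M:%S.%f%z")
when_utc = when.astimezone(timezone.utc)
level = SEVERITY.get(fields["severity"], "Unknown")

print(when_utc.isoformat(), level)
```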

5. NCSA access logs (Use a “Parse” rule in Coralogix):

https://regex101.com/r/Iuos8u/1/

Sample Log

172.21.13.45 - Microsoft\JohnDoe [07/Apr/2004:17:39:04 -0800] "GET /scripts/iisadmin/ism.dll?http/serv HTTP/1.0" 200 3401

Regular Expression

(?P<clientIP>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\s*(?P<userIdentidier>[^ ]+) (?P<userID>[^ ]+) \[(?P<timestamp>[^\]]+)\] "(?P<clientRequest>[^"]+)" (?P<statusCode>[0-9-]+) (?P<numBytes>[0-9-]+)

Results

{ 
   "numBytes" : "3401" ,
   "userIdentidier" : "-" ,
   "clientIP" : "172.21.13.45" ,
   "userID" : "Microsoft\JohnDoe" ,
   "statusCode" : "200" , 
   "timestamp" : "07/Apr/2004:17:39:04 -0800" , 
   "clientRequest" : "GET /scripts/iisadmin/ism.dll?http/serv HTTP/1.0"
}

6. Java Stacktrace (Use an “Extract” rule in Coralogix):

https://regex101.com/r/ZAAuBW/2

Sample Log

Exception in thread "main" java.lang.NullPointerException 
at com.example.myproject.Book.getTitle(Book.java:16) 
at com.example.myproject.Author.getBookTitles(Author.java:25)
at com.example.myproject.Bootstrap.main(Bootstrap.java:14)

Regular Expression

Exception(?: in thread) "(?P<threadName>[^"]+)" (?P<changethenamelater>.*)\s+(?P<stackeholder>(.|\n)*)

Results

{ 
  "changethenamelater" : "java.lang.NullPointerException " ,
  "stackeholder" : "at com.example.myproject.Book.getTitle(Book.java:16) 
                    at com.example.myproject.Author.getBookTitles(Author.java:25) 
                    at com.example.myproject.Bootstrap.main(Bootstrap.java:14)" ,
  "threadName" : "main" 
}
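The expression above keeps the whole stack as one string. If you want the individual frames, a second pass over that string is one option. The Python sketch below is an illustration only: the `FRAME` pattern and its group names (`method`, `file`, `line`) are assumptions introduced here, not one of the Coralogix rules.

```python
import re

# Hypothetical per-frame pattern: "at <qualified method>(<file>:<line>)".
FRAME = re.compile(r'at (?P<method>[\w.$]+)\((?P<file>[^:()]+):(?P<line>[0-9]+)\)')

# The stack portion of the sample log above.
stack = (
    "at com.example.myproject.Book.getTitle(Book.java:16)\n"
    "at com.example.myproject.Author.getBookTitles(Author.java:25)\n"
    "at com.example.myproject.Bootstrap.main(Bootstrap.java:14)"
)

# finditer walks the string and yields one match per frame.
frames = [m.groupdict() for m in FRAME.finditer(stack)]

for frame in frames:
    print(frame["method"], frame["file"], frame["line"])
```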

7. Basic HTTP Headers (Use an “Extract” rule in Coralogix):

https://regex101.com/r/JRYot3/1

Sample Log

GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1

Regular Expression

(?P<method>[A-Z]+) (?P<path>[^ ]+) (?P<protocol>[A-Z0-9./]+)

Results

{ 
  "path" : "/tutorials/other/top-20-mysql-best-practices/" , 
  "protocol" : "HTTP/1.1" , 
  "method" : "GET"
}

8. Nginx (Use a “Parse” rule in Coralogix):

https://regex101.com/r/yHA8Yh/1

Sample Log

127.0.0.1 - dbmanager [20/Nov/2017:18:52:17 +0000] "GET / HTTP/1.1" 401 188 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0"

Regular Expression

(?P<remoteAdd>[0-9.]+) - (?P<remoteUser>[a-zA-Z]+) \[(?P<timestamp>[^\]]+)\] "(?P<request>[^"]+)" (?P<status>[0-9]+) (?P<bodyBytesSent>[0-9]+) "(?P<httpReferer>[^"]+)" "(?P<httpUserAgent>[^"]+)"

Results

{ 
  "remoteUser" : "dbmanager" , 
  "request" : "GET / HTTP/1.1" ,
  "bodyBytesSent" : "188" , 
  "remoteAdd" : "127.0.0.1" , 
  "httpUserAgent" : "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0" ,
  "httpReferer" : "-" , 
  "timestamp" : "20/Nov/2017:18:52:17 +0000" ,
  "status" : "401"
}

9. MySQL (Use a “Parse” rule in Coralogix):

https://regex101.com/r/NjtRLZ/4

Sample Log

2018-03-31T15:38:44.521650Z 2356 Query SELECT c FROM sbtest1 WHERE id=164802

Regular Expression

(?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}Z) ? ? ? ? ? ?(?P<connID>[0-9]+) (?P<name>[a-zA-Z]+) (?P<sqltext>[^\n]+)

Results

{
  "sqltext" : "SELECT c FROM sbtest1 WHERE id=164802" ,
  "name" : "Query" ,
  "connID" : "2356" ,
  "timestamp" : "2018-03-31T15:38:44.521650Z"
}
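The run of ` ?` tokens after the timestamp allows between zero and six spaces before the connection ID, which covers entries where the ID column is padded. The Python sketch below checks that behaviour with the expression as written in this section (backslash escapes included). The padded lines are hypothetical inputs made up for the check.

```python
import re

MYSQL = re.compile(
    r'(?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}Z)'
    r' ? ? ? ? ? ?(?P<connID>[0-9]+) (?P<name>[a-zA-Z]+) (?P<sqltext>[^\n]+)'
)

one_space = "2018-03-31T15:38:44.521650Z 2356 Query SELECT c FROM sbtest1 WHERE id=164802"
# Hypothetical lines: the same layout with a padded connection ID column.
padded = "2018-03-31T15:38:44.521650Z    12 Query SELECT 1"          # 4 spaces
too_wide = "2018-03-31T15:38:44.521650Z" + " " * 8 + "12 Query SELECT 1"  # 8 spaces

first = MYSQL.search(one_space).groupdict()
second = MYSQL.search(padded).groupdict()
third = MYSQL.search(too_wide)  # more than six spaces: the rule does not match

print(first["connID"], second["connID"], third)
```

If your logs pad the column more widely, loosening that part of the expression is one of the adjustments the note at the top of this post has in mind.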

10. ALB (Use a “Parse” rule in Coralogix):

https://regex101.com/r/NjtRLZ/4

Sample Log

http 2018-11-30T22:23:00.186641Z app/my-loadbalancer/50dc6c495c0c9188 192.168.131.39:2817 - 0.000 0.001 0.000 502 - 34 366 "GET http://www.example.com:80/ HTTP/1.1" "curl/7.46.0" - - arn:aws:elasticloadbalancing:us-east-2:123456789012:targetgroup/my-targets/73e2d6bc24d8a067 "Root=1-58337364-23a8c76965a2ef7629b185e3" "-" "-" 0 2018-11-30T22:22:48.364000Z "forward" "-" "LambdaInvalidResponse"

Regular Expression

(?P<type>[a-z0-9]{2,5}) (?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}Z) (?P<elb>[^ ]+) (?P<clientPort>[0-9.:-]+) (?P<targetPort>[0-9.:-]+) (?P<requestProcessTime>[0-9.:-]+) (?P<targetProcessTime>[0-9.:-]+) (?P<responseProcessingTime>[0-9.:-]+) (?P<elbStatusCode>[0-9-]+) (?P<targetStatus>[0-9-]+) (?P<recievedBytes>[0-9-]+) (?P<sentBytes>[0-9-]+) "(?P<request>[^"]+)" "(?P<userAgent>[^"]+)" (?P<sslCipher>[^ ]+) (?P<sslProtocol>[^ ]+) (?P<targetGroupArn>[^ ]+) "(?P<traceID>[^"]+)" "(?P<domainName>[^"]+)" "(?P<chosenCertArn>[^"]+)" (?P<matchedRulePriority>[^ ]+) (?P<requestCreationTime>[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}Z) "(?P<actionsExecuted>[^"]+)" "(?P<redirectURL>[^"]+)" "(?P<errorReason>[^"]+)"

Results

{
  "type" : "http" ,
  "timestamp" : "2018-11-30T22:23:00.186641Z" ,
  "elb" : "app/my-loadbalancer/50dc6c495c0c9188" ,
  "clientPort" : "192.168.131.39:2817" ,
  "targetPort" : "-" ,
  "requestProcessTime" : "0.000" ,
  "targetProcessTime" : "0.001" ,
  "responseProcessingTime" : "0.000" ,
  "elbStatusCode" : "502" ,
  "targetStatus" : "-" ,
  "recievedBytes" : "34" ,
  "sentBytes" : "366" ,
  "request" : "GET http://www.example.com:80/ HTTP/1.1" ,
  "userAgent" : "curl/7.46.0" ,
  "sslCipher" : "-" ,
  "sslProtocol" : "-" ,
  "targetGroupArn" : "arn:aws:elasticloadbalancing:us-east-2:123456789012:targetgroup/my-targets/73e2d6bc24d8a067" ,
  "traceID" : "Root=1-58337364-23a8c76965a2ef7629b185e3" ,
  "domainName" : "-" ,
  "chosenCertArn" : "-" ,
  "matchedRulePriority" : "0" ,
  "requestCreationTime" : "2018-11-30T22:22:48.364000Z" ,
  "actionsExecuted" : "forward" ,
  "redirectURL" : "-" ,
  "errorReason" : "LambdaInvalidResponse"
}

11. IIS (Use a “Parse” rule in Coralogix):

https://regex101.com/r/ytJdyE/2

Sample Log

192.168.114.201, -, 03/20/05, 7:55:20, W3SVC2, SERVER, 172.21.13.45, 4502, 163, 3223, 200, 0, GET, /DeptLogo.gif, -,

Regular Expression

(?P<clientIP>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}), (?P<userName>[^,]+), (?P<timestamp>[0-9]{2}/[0-9]{2}/[0-9]{2}, [0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}), (?P<serviceInstance>[^,]+), (?P<serverName>[^,]+), (?P<serverIP>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}), (?P<timeTaken>[^,]+), (?P<clientBytesSent>[^,]+), (?P<serverBytesSent>[^,]+), (?P<serviceStatusCode>[^,]+), (?P<windowsStatusCode>[^,]+), (?P<requestType>[^,]+), (?P<targetOfOperation>[^,]+), (?P<parameters>[^,]+),

Results

{
  "requestType" : "GET" ,
  "windowsStatusCode" : "0" ,
  "serverName" : "SERVER" ,
  "userName" : "-" ,
  "timeTaken" : "4502" ,
  "serverBytesSent" : "3223" ,
  "clientIP" : "192.168.114.201" ,
  "serverIP" : "172.21.13.45" ,
  "serviceInstance" : "W3SVC2" ,
  "parameters" : "-" ,
  "targetOfOperation" : "/DeptLogo.gif" ,
  "serviceStatusCode" : "200" ,
  "clientBytesSent" : "163" ,
  "timestamp" : "03/20/05, 7:55:20"
}

12. Apache (Use a “Parse” rule in Coralogix):

https://regex101.com/r/2pwM6J/1

Sample Log

127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326

Regular Expression

(?P<clientIP>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) (?P<identd>[^ ]+) (?P<userid>[^ ]+) \[(?P<timesptamp>[^\]]+)\] "(?P<request>[^"]+)" (?P<statusCode>[^ ]+) (?P<objectSize>[^ ]+)

Results

{
  "request" : "GET /apache_pb.gif HTTP/1.0" ,
  "timesptamp" : "10/Oct/2000:13:55:36 -0700" ,
  "objectSize" : "2326" ,
  "clientIP" : "127.0.0.1" ,
  "identd" : "-" ,
  "userid" : "frank" ,
  "statusCode" : "200"
}
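The `request` field captured here has the same shape as the line handled in the Basic HTTP Headers section, so the two expressions can be chained. The Python sketch below shows one way to do that. The sample line is written with plain ASCII quotes and hyphen, both expressions carry their backslash escapes, and the two-step approach itself is only an illustration of combining a “Parse” rule with an “Extract” rule.

```python
import re

APACHE = re.compile(
    r'(?P<clientIP>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) (?P<identd>[^ ]+) '
    r'(?P<userid>[^ ]+) \[(?P<timesptamp>[^\]]+)\] "(?P<request>[^"]+)" '
    r'(?P<statusCode>[^ ]+) (?P<objectSize>[^ ]+)'
)
# Expression from the "Basic HTTP Headers" section.
REQUEST = re.compile(r'(?P<method>[A-Z]+) (?P<path>[^ ]+) (?P<protocol>[A-Z0-9./]+)')

line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'

# Step 1: split the whole line into its top-level fields.
fields = APACHE.search(line).groupdict()

# Step 2: break the captured request down further and merge the extra keys in.
fields.update(REQUEST.search(fields["request"]).groupdict())

print(fields)
```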
