Filtering syslog-forwarded HAProxy logs in Logstash

Repox · Dec 8, 2014

I'm having issues understanding how to do this correctly.

I have the following Logstash config:

input {
  lumberjack {
    port => 5000
    host => "127.0.0.1"
    ssl_certificate => "/etc/ssl/star_server_com.crt"
    ssl_key => "/etc/ssl/server.key"
    type => "somelogs"
  }
}

output {
  elasticsearch {
    protocol => "http"
    host => "es01.server.com"
  }
}

With logstash-forwarder, I'm pushing my haproxy.log file, which is generated by syslog, to Logstash. Kibana then shows me a _source that looks like this:

{"message":"Dec 8 11:32:20 localhost haproxy[5543]: 217.116.219.53:47746 [08/Dec/2014:11:32:20.938] es_proxy es_proxy/es02.server.com 0/0/1/18/20 200 305 - - ---- 1/1/1/0/0 0/0 \"GET /_cluster/health HTTP/1.1\"","@version":"1","@timestamp":"2014-12-08T11:32:21.603Z","type":"syslog","file":"/var/log/haproxy.log","host":"haproxy.server.com","offset":"4728006"}

Now, this has to be filtered (somehow) and I have to admit I haven't got the slightest idea how.
Looking at the grok documentation and fiddling with the grok debugger I still haven't got anything useful out of Logstash and Kibana.

I've been scanning the patterns directory and its files, and I can't say I understand how to use them. I was hoping that if I provided a filter with a haproxy pattern, Logstash would match it against the message in my _source, but I had no luck.

Answer

Magnus Bäck · Dec 8, 2014

You're in luck since there already is a predefined grok pattern that appears to parse this exact type of log. All you have to do is refer to it in a grok filter:

filter {
  grok {
    match => ["message", "%{HAPROXYHTTP}"]
  }
}

%{HAPROXYHTTP} will be recursively expanded according to the pattern definition, and each captured piece of every input line (the client address, the timers, the HTTP status code and so on) will be extracted to its own field. You may also want to remove the 'message' field after a successful application of the grok filter, since by then it only duplicates data held in the extracted fields; just add remove_field => ["message"] to the grok filter declaration.
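
Putting both suggestions together, the filter section might look something like the sketch below. It assumes the raw line arrives in the 'message' field, as in the _source shown in the question, and that the stock HAPROXYHTTP pattern matches your HAProxy log format; if you use a custom log format, the pattern would need adjusting.

```logstash
filter {
  grok {
    # HAPROXYHTTP is one of the grok patterns bundled with Logstash.
    match => ["message", "%{HAPROXYHTTP}"]
    # Applied only when the match succeeds, so lines that fail to
    # parse keep their raw text for troubleshooting.
    remove_field => ["message"]
  }
}
```

Events that do not match are tagged _grokparsefailure by default, so searching for that tag in Kibana is a quick way to check whether the pattern fits your logs.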