Parsing whitespace in grok

lightweight · Apr 10, 2017 · Viewed 7.7k times

I'm having some issue with white spaces in grok...

I have strings that look like this:

1491783364087   group-segmentation-service-master asdf-replica-sync-dev         5          55              55              0               consumer-1_ip-34-25-65.companya.com/10.34.25.65

I'm trying to parse them with grok with something like this:

%{NUMBER:poll_time} +%{WORD:consumer_group} +%{WORD:topic} +%{NUMBER:partition} +%{NUMBER:current_offset} +%{NUMBER:log_end_offset} +%{NUMBER:lag}

but I think I'm having issues accounting for the white spaces...

I've been trying to test various patterns in this: http://grokdebug.herokuapp.com/

but haven't had much luck...

Answer

Fairy · Apr 12, 2017

You can use the grok token %{SPACE} to account for whitespace. Also, %{WORD} won't match your consumer group and topic: its regex is \b\w+\b, and \w translates to [A-Za-z0-9_] (alphanumerics plus underscore), so the match stops at the first hyphen. The closest token that does match them is %{NOTSPACE}.

Something like this should work:

%{NUMBER:poll_time}%{SPACE}%{NOTSPACE:consumer_group}%{SPACE}%{NOTSPACE:topic}%{SPACE}%{NUMBER:partition}%{SPACE}%{NUMBER:current_offset}%{SPACE}%{NUMBER:log_end_offset}%{SPACE}%{NUMBER:lag}
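
If you want to check this outside the grok debugger, here is a rough Python sketch that expands both patterns into plain regexes and runs them against the sample line. The TOKENS table holds simplified stand-ins for the grok definitions (the real NUMBER pattern in particular is more involved), and grok_to_regex and parse are hypothetical helper names made up for this illustration, not part of any grok library.

```python
import re

# Simplified regex stand-ins for the grok tokens. These are assumptions for
# illustration only; the definitions shipped with Logstash are stricter.
TOKENS = {
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
    "WORD": r"\b\w+\b",
    "NOTSPACE": r"\S+",
    "SPACE": r"\s*",
}


def grok_to_regex(pattern):
    """Expand %{TOKEN} and %{TOKEN:field} references into a compiled regex."""
    def expand(match):
        token, field = match.group(1), match.group(2)
        body = TOKENS[token]
        # Named tokens become named groups, unnamed ones stay non-capturing.
        return f"(?P<{field}>{body})" if field else f"(?:{body})"

    return re.compile(re.sub(r"%\{(\w+)(?::(\w+))?\}", expand, pattern))


def parse(pattern, line):
    """Return the captured fields as a dict, or None if nothing matches."""
    match = grok_to_regex(pattern).search(line)
    return match.groupdict() if match else None


LINE = (
    "1491783364087   group-segmentation-service-master asdf-replica-sync-dev"
    "         5          55              55              0"
    "               consumer-1_ip-34-25-65.companya.com/10.34.25.65"
)

# Pattern from the question: WORD for the group and topic, " +" between fields.
ORIGINAL = (
    "%{NUMBER:poll_time} +%{WORD:consumer_group} +%{WORD:topic} "
    "+%{NUMBER:partition} +%{NUMBER:current_offset} "
    "+%{NUMBER:log_end_offset} +%{NUMBER:lag}"
)

# Pattern from the answer: NOTSPACE for the group and topic, SPACE between fields.
FIXED = (
    "%{NUMBER:poll_time}%{SPACE}%{NOTSPACE:consumer_group}%{SPACE}"
    "%{NOTSPACE:topic}%{SPACE}%{NUMBER:partition}%{SPACE}"
    "%{NUMBER:current_offset}%{SPACE}%{NUMBER:log_end_offset}%{SPACE}"
    "%{NUMBER:lag}"
)

if __name__ == "__main__":
    print(parse(ORIGINAL, LINE))
    print(parse(FIXED, LINE))
```

Under these simplified definitions the first call finds no match, because WORD cannot cross the hyphens in the group and topic names, while the second returns all seven fields. Neither pattern is anchored, so the trailing consumer id is simply left uncaptured.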