grok filter (regex) to extract string within square brackets

VinothNair · Jun 25, 2015 · Viewed 12.7k times

My application log entries are given below:

2015-06-24 14:03:16.7288  Sent request message [649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74] <Request>sometext</Request>

2015-06-24 14:38:05.2460  Received response message [649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74] <Response>sometext</Response>

I am using the Logstash grok filter to extract the XML content and the client token that appears inside the square brackets.

grok {  
    match => ["message", "(?<content>(<Request(.)*?</Request>))"]   
    match => ["message", "(?<clienttoken>(Sent request message \[(.)*?\]))"]
    add_tag => "Request"
    break_on_match => false
    tag_on_failure => [ ]
}

grok {  
    match => ["message", "(?<content>(<Response(.)*?</Response>))"] 
    match => ["message", "(?<clienttoken>(Received response message \[(.)*?\]))"]
    add_tag => "Response"
    break_on_match => false
    tag_on_failure => [ ]
}

The result currently looks like this:

For the first log line:

Content = <Request>sometext</Request>
clienttoken = Sent request message [649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74]

For the second log line:

Content = <Response>sometext</Response>
clienttoken = Received response message [649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74]

But I want the result to be like this:

Content = <Request>sometext</Request>
clienttoken = 649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74

How can I extract only the string inside the square brackets, without the rest of the text that the pattern matches?

Answer

Avinash Raj · Jun 25, 2015

You may use lookbehind and lookahead assertions.

(?<=Sent request message \[).*?(?=\])

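Lookarounds assert the surrounding text without including it in the match, so only the text between the brackets is matched. Do the same for the response message, using "Received response message \[" as the lookbehind prefix.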
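
To use this in grok, the lookarounds would need to sit inside the named capture so that the clienttoken field receives only the token. A minimal sketch, assuming the same log format, field names, and settings as in the question (not verified against a running Logstash instance):

grok {
    # The lookbehind and lookahead only assert the surrounding text;
    # clienttoken captures just the value between the brackets.
    match => {
        "message" => [
            "(?<content><Request.*?</Request>)",
            "(?<clienttoken>(?<=Sent request message \[).*?(?=\]))"
        ]
    }
    add_tag => ["Request"]
    break_on_match => false
    tag_on_failure => [ ]
}

An alternative that avoids lookarounds altogether is to put the named group around the bracket contents only, for example: Sent request message \[(?<clienttoken>[^\]]+)\]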