Grep a Log file for the last occurrence of a string between two strings

rs79 picture rs79 · Oct 30, 2013 · Viewed 46k times · Source

I have a log file trace.log. In it I need to grep for the content contained within the strings <tag> and </tag>. There are multiple sets of this pair of strings, and I just need to return the content between last set (in other words, from the tail of the log file).

Extra Credit: Any way I can return the content contained within the two strings only if the content contains "testString"?

Thanks for looking.

EDIT: The search parameters and are contained on different lines with about 100 lines of content separating them. The content is what I'm after...

Answer

fedorqui &#39;SO stop harming&#39; picture fedorqui 'SO stop harming' · Oct 30, 2013

Use tac to print the file the other way round and then grep -m1 to just print one result. The look behind and look ahead checks text in between <tag> and </tag>.

tac a | grep -m1 -oP '(?<=tag>).*(?=</tag>)'

Test

Given this file

$ cat a
<tag> and </tag>
aaa <tag> and <b> other things </tag>
adsaad <tag>and  last one</tag>

$ tac a | grep -m1 -oP '(?<=tag>).*(?=</tag>)'
and  last one

Update

EDIT: The search parameters and are contained on different lines with about 100 lines of content separating them. The content is what I'm after...

Then it is a bit more tricky:

tac file | awk '/<\/tag>/ {p=1; split($0, a, "</tag>"); $0=a[1]};
                /<tag>/   {p=0; split($0, a, "<tag>");  $0=a[2]; print; exit};
                p' | tac

The idea is to reverse the file and use a flag p to check if the <tag> has appeared yet or not. It will start printing when </tag> appears and finished when <tag> comes (because we are reading the other way round).

  • split($0, a, "</tag>"); $0=a[1]; gets the data before </tag>
  • split($0, a, "<tag>" ); $0=a[2]; gets the data after <tag>

Test

Given a file a like this:

<tag> and </tag>
aaa <tag> and <b> other thing
come here
and here </tag>

some text<tag>tag is starting here
blabla
and ends here</tag>

The output will be:

$ tac a | awk '/<\/tag>/ {p=1; split($0, a, "</tag>"); $0=a[1]}; /<tag>/ {p=0; split($0, a, "<tag>"); $0=a[2]; print; exit}; p' | tac
tag is starting here
blabla
and ends here