rs79

Reputation: 2321

Grep a Log file for the last occurrence of a string between two strings

I have a log file trace.log. In it I need to grep for the content contained within the strings <tag> and </tag>. There are multiple sets of this pair of strings, and I just need to return the content between the last set (in other words, from the tail of the log file).

Extra Credit: Any way I can return the content contained within the two strings only if the content contains "testString"?

Thanks for looking.

EDIT: The search parameters <tag> and </tag> are contained on different lines with about 100 lines of content separating them. The content is what I'm after...

Upvotes: 26

Views: 48305

Answers (5)

Vorsprung

Reputation: 34307

perl -e '$/=undef; $f=<>; push @a,$1 while($f=~m#<tag>(.*?)</tag>#msg); print $a[-1]' ex.txt

Extra Credit: Any way I can return the content contained within the two strings only if the content contains "testString"?

perl -e '$/=undef; $f=<>; push @a,$1 while($f=~m#<tag>(.*?)</tag>#msg); print $a[-1] if ($a[-1]=~/testString/);' ex.txt

Upvotes: 0

SlackGadget

Reputation: 557

If, like me, you don't have access to tac because your sysadmin won't play ball, you can try:

grep pattern file | tail -1
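Applied to the tags from the question, this could look roughly like the sketch below. It only covers the case where <tag> and </tag> sit on the same line, and the function name last_match is made up for illustration.

```shell
#!/bin/sh
# last_match FILE: print the last line that contains a complete
# <tag>...</tag> pair. Single-line case only; no tac needed.
last_match() {
    grep '<tag>.*</tag>' "$1" | tail -n 1
}

# Hypothetical sample file, built here only to show the call.
sample=$(mktemp)
printf '%s\n' 'a <tag>one</tag>' 'no tags here' 'b <tag>two</tag> c' > "$sample"

last_match "$sample"
rm -f "$sample"
```

grep still scans the whole file and tail discards all but the last hit, so on a very large log this does more work than a tac-based approach that can stop early.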

Upvotes: 26

pfnuesel

Reputation: 15300

An alternative to grep is sed:

tac file | sed -n '0,/<tag>\(.*\)<\/tag>/s//\1/p'

tac file prints the file in reverse line order (cat backwards). sed then works from input line 0 to the first occurrence of <tag>.*</tag> and substitutes the whole match with only the part that was inside the tags. The p flag prints the result, since -n suppresses the default output. Note that the 0,/regex/ address form is a GNU sed extension.

Edit: This does not work if <tag> and </tag> are on different lines. We can still use sed for that:

tac file | sed -n '/<\/tag>/,$p; /<tag>/q' | sed 's/.*<tag>//; s/<\/tag>.*//' | tac

Again we use tac to read the file backwards. The first sed command prints from the first occurrence of </tag> and quits when it finds <tag>, so only the lines of the last block are printed. We then pass them to a second sed process to strip the tags and finally reverse the lines again with tac.
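To make the multi-line pipeline easier to try out, here is a small sketch that wraps it in a function and runs it on a made-up sample. The function name last_block_sed and the sample contents are assumptions for illustration; tac must be available.

```shell
#!/bin/sh
# last_block_sed FILE: print the text inside the last <tag>...</tag> pair
# when the two tags are on different lines.
last_block_sed() {
    tac "$1" |
        sed -n '/<\/tag>/,$p; /<tag>/q' |   # print from the last </tag> back to its <tag>
        sed 's/.*<tag>//; s/<\/tag>.*//' |  # strip the tags and anything outside them
        tac                                 # restore the original line order
}

# Hypothetical sample log with two blocks.
sample=$(mktemp)
printf '%s\n' 'x <tag>first' 'block</tag>' 'noise' \
    'y <tag>second' 'middle' 'block</tag> z' > "$sample"

last_block_sed "$sample"
rm -f "$sample"
```

On this sample the function prints the three lines second, middle and block. Because the first sed quits at <tag>, the earlier block is never read.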

Upvotes: 1

fedorqui

Reputation: 289515

Use tac to print the file in reverse line order and then grep -m1 to stop after the first match. The lookbehind and lookahead restrict the match to the text between <tag> and </tag>.

tac a | grep -m1 -oP '(?<=tag>).*(?=</tag>)'

Test

Given this file

$ cat a
<tag> and </tag>
aaa <tag> and <b> other things </tag>
adsaad <tag>and  last one</tag>

$ tac a | grep -m1 -oP '(?<=tag>).*(?=</tag>)'
and  last one

Update

EDIT: The search parameters <tag> and </tag> are contained on different lines with about 100 lines of content separating them. The content is what I'm after...

Then it is a bit more tricky:

tac file | awk '/<\/tag>/ {p=1; split($0, a, "</tag>"); $0=a[1]};
                /<tag>/   {p=0; split($0, a, "<tag>");  $0=a[2]; print; exit};
                p' | tac

The idea is to reverse the file and use a flag p to track whether we are inside the block. Printing starts when </tag> appears and finishes when <tag> comes (because we are reading the other way round).

  • split($0, a, "</tag>"); $0=a[1]; gets the data before </tag>
  • split($0, a, "<tag>" ); $0=a[2]; gets the data after <tag>

Test

Given a file a like this:

<tag> and </tag>
aaa <tag> and <b> other thing
come here
and here </tag>

some text<tag>tag is starting here
blabla
and ends here</tag>

The output will be:

$ tac a | awk '/<\/tag>/ {p=1; split($0, a, "</tag>"); $0=a[1]}; /<tag>/ {p=0; split($0, a, "<tag>"); $0=a[2]; print; exit}; p' | tac
tag is starting here
blabla
and ends here
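For the "extra credit" part of the question, one possible approach is to capture the output of the awk pipeline above and print it only if it contains the search string. This is a sketch under that assumption; the function name last_block_if and the sample file are hypothetical.

```shell
#!/bin/sh
# last_block_if FILE WORD: print the content of the last <tag>...</tag> block,
# but only if that content contains WORD (matched as a fixed string).
last_block_if() {
    block=$(tac "$1" | awk '
        /<\/tag>/ {p=1; split($0, a, "</tag>"); $0=a[1]}
        /<tag>/   {p=0; split($0, a, "<tag>");  $0=a[2]; print; exit}
        p' | tac)
    # grep -qF: quiet, fixed-string match; print the block only on a hit.
    if printf '%s\n' "$block" | grep -qF -- "$2"; then
        printf '%s\n' "$block"
    fi
}

# Hypothetical sample: only the last block is ever considered.
sample=$(mktemp)
printf '%s\n' 'x <tag>first' 'block</tag>' \
    'y <tag>second' 'testString here' 'end</tag> z' > "$sample"

last_block_if "$sample" testString   # prints the last block
last_block_if "$sample" nomatch      # prints nothing
rm -f "$sample"
```

Note that this checks the last block only. It does not fall back to an earlier block that happens to contain the string.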

Upvotes: 35

mpez0

Reputation: 2883

A little untested awk that handles multiple lines:

awk '
    /<tag>/   {keep = 1; retain = ""}      # opening tag: clear and start a fresh block
    keep      {retain = retain $0 "\n"}    # inside a block: append the line
    /<\/tag>/ {keep = 0}                   # closing tag: stop keeping lines
    END       {printf "%s", retain}        # print the last retained block
' filename

We start by just reading the file. When we hit a <tag>, we start keeping lines; when we hit a </tag>, we stop. If we hit another <tag>, we clear the retained string and start again. If you want all the strings, print at each </tag>.
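The variant that prints every block rather than only the last one might look like the sketch below. It uses awk string concatenation (juxtaposition) to build up the block; the function name all_blocks and the "--" separator line are assumptions added for illustration.

```shell
#!/bin/sh
# all_blocks FILE: print every <tag>...</tag> block (tag lines included),
# followed by a "--" separator line after each block.
all_blocks() {
    awk '
        /<tag>/   {keep = 1; retain = ""}       # opening tag: start a fresh block
        keep      {retain = retain $0 "\n"}     # inside a block: collect the line
        /<\/tag>/ {if (keep) printf "%s--\n", retain; keep = 0}   # closing tag: emit
    ' "$1"
}

# Hypothetical sample with one multi-line and one single-line block.
sample=$(mktemp)
printf '%s\n' 'x <tag>a' 'b</tag>' 'noise' '<tag>c</tag>' > "$sample"

all_blocks "$sample"
rm -f "$sample"
```

Piping the result through a further filter, or dropping the separator, is a matter of taste; the point is only that the print moves from END to the closing-tag rule.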

Upvotes: 0
