Reputation: 2321
I have a log file trace.log. In it I need to grep for the content contained within the strings <tag> and </tag>. There are multiple sets of this pair of strings, and I just need to return the content between the last set (in other words, from the tail of the log file).
Extra Credit: Any way I can return the content contained within the two strings only if the content contains "testString"?
Thanks for looking.
EDIT: The search parameters <tag> and </tag> are contained on different lines, with about 100 lines of content separating them. The content is what I'm after...
Upvotes: 26
Views: 48305
Reputation: 34307
perl -e '$/=undef; $f=<>; push @a,$1 while($f=~m#<tag>(.*?)</tag>#msg); print $a[-1]' ex.txt
Extra Credit: Any way I can return the content contained within the two strings only if the content contains "testString"?
perl -e '$/=undef; $f=<>; push @a,$1 while($f=~m#<tag>(.*?)</tag>#msg); print $a[-1] if ($a[-1]=~/testString/);' ex.txt
Upvotes: 0
Reputation: 557
If, like me, you don't have access to tac because your sysadmin won't play ball, you can try:
grep pattern file | tail -1
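Note that this prints the whole last matching line, not just the content. If both tags sit on the same line, a sed step can strip everything outside the pair. A minimal sketch, assuming single-line <tag>...</tag> pairs; the function name last_tag_content is made up for illustration:

```shell
# Hypothetical helper: print the content of the last single-line <tag>...</tag> pair.
last_tag_content() {
    grep '<tag>.*</tag>' "$1" |           # keep only lines holding a complete pair
        tail -n 1 |                        # last such line in the file
        sed 's/.*<tag>//; s/<\/tag>.*//'   # drop everything outside the pair
}
```

Called as last_tag_content trace.log. It will not help when the tags are on different lines.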
Upvotes: 26
Reputation: 15300
An alternative to grep is sed:
tac file | sed -n '0,/<tag>\(.*\)<\/tag>/s//\1/p'
tac file prints the file in reverse order (cat backwards). sed then works on the range from input line 0 to the first occurrence of <tag>.*</tag> (the 0,/re/ address form is a GNU sed extension) and substitutes the whole match with only the part that was inside the tags. The p flag prints the result, since -n suppresses the default output.
Edit: This does not work if <tag> and </tag> are on different lines. We can still use sed for that:
tac file | sed -n '/<\/tag>/,$p; /<tag>/q' | sed 's/.*<tag>//; s/<\/tag>.*//' | tac
Again we use tac to read the file backwards. The first sed command prints from the first occurrence of </tag> and quits when it finds <tag>, so only the lines in between are printed. We then pass them to a second sed process to strip the tags themselves, and finally restore the original line order with tac.
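For the extra credit, one option is to capture the output of the multi-line sed pipeline above and print it only when it contains "testString". This is a sketch under the assumption that GNU tac and sed are available; the function name last_block_if_match is hypothetical:

```shell
# Hypothetical wrapper around the multi-line sed pipeline above.
# Prints the last <tag>...</tag> block only if it contains "testString".
last_block_if_match() {
    block=$(tac "$1" |
        sed -n '/<\/tag>/,$p; /<tag>/q' |
        sed 's/.*<tag>//; s/<\/tag>.*//' |
        tac)
    case $block in
        *testString*) printf '%s\n' "$block" ;;  # match: print the block
        *) return 1 ;;                            # no match: print nothing
    esac
}
```

The non-zero return status lets a caller tell "no match" apart from an empty block.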
Upvotes: 1
Reputation: 289515
Use tac to print the file the other way round and then grep -m1 to print just one result. The lookbehind and lookahead make grep match only the text in between <tag> and </tag>.
tac a | grep -m1 -oP '(?<=tag>).*(?=</tag>)'
Given this file
$ cat a
<tag> and </tag>
aaa <tag> and <b> other things </tag>
adsaad <tag>and last one</tag>
$ tac a | grep -m1 -oP '(?<=tag>).*(?=</tag>)'
and last one
EDIT: The search parameters <tag> and </tag> are contained on different lines, with about 100 lines of content separating them. The content is what I'm after...
Then it is a bit more tricky:
tac file | awk '/<\/tag>/ {p=1; split($0, a, "</tag>"); $0=a[1]};
/<tag>/ {p=0; split($0, a, "<tag>"); $0=a[2]; print; exit};
p' | tac
The idea is to reverse the file and use a flag p to track whether we are currently inside a block. Printing starts when </tag> appears and finishes when <tag> comes (because we are reading the other way round).
split($0, a, "</tag>"); $0=a[1]; gets the data before </tag>
split($0, a, "<tag>" ); $0=a[2]; gets the data after <tag>
Given a file a like this:
<tag> and </tag>
aaa <tag> and <b> other thing
come here
and here </tag>
some text<tag>tag is starting here
blabla
and ends here</tag>
The output will be:
$ tac a | awk '/<\/tag>/ {p=1; split($0, a, "</tag>"); $0=a[1]}; /<tag>/ {p=0; split($0, a, "<tag>"); $0=a[2]; print; exit}; p' | tac
tag is starting here
blabla
and ends here
Upvotes: 35
Reputation: 2883
A little untested awk that handles multiple lines:
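awk '
BEGIN {retain = ""; keep = 0}
/<\/tag>/ {retain = retain "\n" $0; keep = 0; next}   # closing tag: add this line, stop keeping
/<tag>/   {keep = 1; retain = $0; next}               # opening tag: drop the old block, start again
keep      {retain = retain "\n" $0}                   # inside a block: append the line
END {print retain}
' filename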
We start just reading the file; when we hit the <tag>, we start keeping lines. When we hit the </tag>, we stop. If we hit another <tag>, we clear the retained string and start again. If you want all the strings, print at each </tag>.
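To collect every block rather than just the last one, a variant can print lines as it goes and stop at each closing tag. A rough sketch using only awk (so no tac is needed); it also strips the tag markers, and the function name all_tag_blocks is invented for this example:

```shell
# Hypothetical helper: print the content of every <tag>...</tag> block,
# one block after another, using only awk.
all_tag_blocks() {
    awk '
    /<tag>/ {                      # opening tag: start a fresh block
        keep = 1
        sub(/.*<tag>/, "")         # drop the tag and anything before it
    }
    keep && /<\/tag>/ {            # closing tag: emit the text before it
        sub(/<\/tag>.*/, "")
        print
        keep = 0
        next
    }
    keep                           # inside a block: print the line as-is
    ' "$1"
}
```

Blocks are printed back to back with no separator, so add a marker line in the closing-tag rule if the boundaries matter.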
Upvotes: 0