Reputation: 689
I want to filter out (invert grep / reverse grep) some XML data from my log files. grep -v works, but a giant XML body gets printed to my logs every 5 minutes, and I want to remove that whole XML body because it is mostly noise and makes the log file harder to parse. The lines I want to remove look something like this:
date time pattern
<xml-tag>
<smaller tag1>
<smaller tag2>
</smaller tag1>
</smaller tag2>
</xml-tag>
When I try something like
grep -v "pattern" log-file
the line containing the pattern is removed, but the XML is left behind. Is there any way to remove everything starting from <xml-tag> and ending in </xml-tag>? I also tried
egrep -v -A6 "pattern" log-file
as a last-ditch effort. I assumed this would remove the match plus the 6 lines after it, but that didn't work either for some reason.
Upvotes: 1
Views: 196
Reputation: 58548
This might work for you (GNU sed):
sed '/pattern/,/<\/xml-tag>/d' logfile
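As a quick check, here is a minimal sketch that runs the same sed command on a sample modeled on the question's log. The sample contents and the variable names (sample, sed_output) are assumptions made for illustration only. The range address and the d command are standard sed features, so this should not be limited to GNU sed.

```shell
# Sample log modeled on the question (contents are illustrative).
sample='before1
date time pattern
<xml-tag>
<smaller tag1>
</smaller tag1>
</xml-tag>
after1'

# Delete every line from one matching "pattern" through the next
# line matching the closing </xml-tag>, inclusive.
sed_output=$(printf '%s\n' "$sample" | sed '/pattern/,/<\/xml-tag>/d')

# Show what survives the filter.
printf '%s\n' "$sed_output"
```

Only the lines outside the block should remain. Note that if a closing tag is missing, sed keeps deleting through the end of the file.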
Upvotes: 0
Reputation: 41460
Here is an awk solution that does not use a range pattern:
awk '/pattern/ {f=1} !f; /<\/xml-tag>/ {f=0}' file
before1
after1
A range pattern is nice, but it is less flexible if other tests need to be done.
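To illustrate the flexibility point, here is a sketch of one possible variation on the flag approach: keep the timestamp line that contains the pattern, but still drop the XML body. This requirement is hypothetical and not part of the question; the sample data and variable names are assumptions as well.

```shell
# Sample log modeled on the question (contents are illustrative).
sample='before1
date time pattern
<xml-tag>
<smaller tag1>
</smaller tag1>
</xml-tag>
after1'

# On the pattern line: print it once, then raise the flag f.
# While f is set, "!f" is false, so the XML lines are skipped.
# The closing tag is tested last, so it is skipped before f is cleared.
flag_output=$(printf '%s\n' "$sample" |
  awk '/pattern/ {print; f=1} !f; /<\/xml-tag>/ {f=0}')

printf '%s\n' "$flag_output"
```

Because the flag is an ordinary variable, extra conditions or actions can be attached to any of the three rules, which is harder to do with a single range pattern.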
Upvotes: 0
Reputation: 113964
To skip all lines between pattern and the closing xml tag:
awk '/pattern/,/<\/xml-tag>/ {next} 1' logfile
With this as the logfile:
$ cat logfile
before1
date time pattern
<xml-tag>
<smaller tag1>
<smaller tag2>
</smaller tag1>
</smaller tag2>
</xml-tag>
after1
The output produced is:
$ awk '/pattern/,/<\/xml-tag>/ {next} 1' logfile
before1
after1
In awk, the expression /pattern/,/<\/xml-tag>/ defines a range of lines that starts with a line that has pattern in it and ends with the next line that has </xml-tag> in it. For any such line, the command next is executed, meaning that the line is not printed and awk starts processing the next line. If next is not executed, in other words we are not in the unwanted block, then the 1 is executed which, in awk, is shorthand for "print this line".
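The explanation above can be checked with a small sketch. It assumes a sample log with two XML blocks (made up for illustration, as are the variable names) to show that the range starts again at every later match, and it compares the 1 shorthand with an explicit { print } rule.

```shell
# Sample log with two noisy blocks (contents are illustrative).
sample='before1
date time pattern
<xml-tag>
</xml-tag>
between
date time pattern
<xml-tag>
</xml-tag>
after1'

# Short form: "1" is an always-true pattern with the default action (print).
short_form=$(printf '%s\n' "$sample" |
  awk '/pattern/,/<\/xml-tag>/ {next} 1')

# Long form: the same program with the print action spelled out.
long_form=$(printf '%s\n' "$sample" |
  awk '/pattern/,/<\/xml-tag>/ {next} {print}')

printf '%s\n' "$short_form"
```

Both forms should print the same three lines, since each block is skipped independently and everything outside a block falls through to the print rule.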
Upvotes: 4