user2476714

Reputation: 689

How do I invert grep matches that are followed by an XML body in logs?

I want to filter some XML data out of my log files (an inverted / reverse grep). grep -v works for single lines, but a giant XML body gets printed every 5 minutes in my logs, and I want to remove that whole body because it is mostly noise and makes the log file harder to parse. The lines I want to remove look something like this:

date time pattern
<xml-tag>
   <smaller tag1>
   <smaller tag2>
   </smaller tag1>
   </smaller tag2>
 </xml-tag>

When I try something like

grep -v "pattern" log-file

the line containing the pattern is removed, but the XML is left behind. Is there any way to remove everything from the opening <xml-tag> through the closing </xml-tag>? I also tried

egrep -v -A6 "pattern" log-file 

as a last-ditch effort. I assumed this would also remove the 6 lines after the match, but that didn't work either for some reason.

Upvotes: 1

Views: 196

Answers (3)

potong

Reputation: 58548

This might work for you (GNU sed):

sed '/pattern/,/<\/xml-tag>/d' logfile
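
As a quick check, the command can be tried on a small sample. The file contents below are made up to mirror the layout in the question, and the temporary file name comes from mktemp. The address range /pattern/,/<\/xml-tag>/ selects each block and d deletes it; range addresses and d are standard sed, so this should not depend on GNU sed specifically.

```shell
#!/bin/sh
# Hypothetical sample log that mirrors the layout shown in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
before1
date time pattern
<xml-tag>
   <smaller tag1>
   </smaller tag1>
 </xml-tag>
after1
EOF

# Delete every line from one containing "pattern" through the next
# line containing "</xml-tag>"; all other lines pass through unchanged.
sed_result=$(sed '/pattern/,/<\/xml-tag>/d' "$log")
printf '%s\n' "$sed_result"

rm -f "$log"
```

Only before1 and after1 should remain in the output.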

Upvotes: 0

Jotne

Reputation: 41460

Here is an awk solution that does not use a range pattern:

awk '/pattern/ {f=1} !f; /<\/xml-tag>/ {f=0}' file
before1
after1

A range pattern is nice, but it is less flexible if other tests need to be done.
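
To illustrate that flexibility, here is a minimal sketch of the same flag idea with one extra piece of bookkeeping: it counts how many blocks were dropped. The function name strip_blocks, the variable names skipping and blocks, and the sample log are all made up for this example; the summary line is written to /dev/stderr so that it does not mix with the filtered output.

```shell
#!/bin/sh
# strip_blocks is a hypothetical helper: it prints the file named in $1
# without the "pattern ... </xml-tag>" blocks and reports the block count.
strip_blocks() {
    awk '
        /pattern/ && !skipping { skipping = 1; blocks++ }  # block starts: skip from here and count it
        !skipping              { print }                   # outside a block: print the line
        /<\/xml-tag>/          { skipping = 0 }            # block ends: resume printing on the next line
        END { printf "skipped %d block(s)\n", blocks > "/dev/stderr" }
    ' "$1"
}

# Hypothetical sample log with two noisy blocks.
log=$(mktemp)
cat > "$log" <<'EOF'
before1
date time pattern
<xml-tag>
   <smaller tag1>
 </xml-tag>
between
date time pattern
<xml-tag>
 </xml-tag>
after1
EOF

filtered=$(strip_blocks "$log")   # the summary line goes to stderr, not into $filtered
printf '%s\n' "$filtered"

rm -f "$log"
```

Because the flag is an ordinary variable, further conditions (for example, only skipping blocks whose first line also matches some other test) can be added to the first rule without restructuring the script.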

Upvotes: 0

John1024

Reputation: 113964

To skip all lines between pattern and the closing xml tag:

awk '/pattern/,/<\/xml-tag>/ {next} 1' logfile

With this as the logfile:

$ cat logfile
before1
date time pattern
<xml-tag>
   <smaller tag1>
   <smaller tag2>
   </smaller tag1>
   </smaller tag2>
 </xml-tag>
after1

The output produced is:

$ awk '/pattern/,/<\/xml-tag>/ {next} 1' logfile
before1
after1

How it works

In awk, the expression /pattern/,/<\/xml-tag>/ defines a range of lines that starts with a line containing pattern and ends with the next line containing </xml-tag> (the slash is written as \/ because it sits inside a /.../ regex). For every line in such a range, the command next is executed, meaning that the line is not printed and awk moves on to the next line. If next is not executed, in other words if we are not in the unwanted block, then the 1 is evaluated, which in awk is shorthand for "print this line".
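
The same logic can be sketched with the trailing 1 spelled out as an explicit {print}, run against a log that contains more than one block. The sample contents are invented for illustration; the point is that the range restarts each time a new line matches pattern, so every block is skipped, not just the first.

```shell
#!/bin/sh
# Hypothetical sample log with two blocks separated by an ordinary line.
log=$(mktemp)
cat > "$log" <<'EOF'
before1
date time pattern
<xml-tag>
   <smaller tag1>
 </xml-tag>
between
date time pattern
<xml-tag>
 </xml-tag>
after1
EOF

# Lines inside a range hit "next" and are never printed;
# every other line falls through to the explicit {print}.
range_result=$(awk '/pattern/,/<\/xml-tag>/ {next} {print}' "$log")
printf '%s\n' "$range_result"

rm -f "$log"
```

The lines before1, between and after1 are the only ones left.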

Upvotes: 4
