Reputation: 261
I have a log file in which a single log entry can span multiple lines, like this:
DEBUG : <line1>
<line2>
TRACE : <line11>
<line12>
<line13>
DEBUG : <line21>
<line22>
<line23>
TRACE : <line31>
<line32>
ERROR : <line41>
<line42>
TRACE : <line51>
<line52>
DEBUG : <line61>
<line62>
I need to strip the TRACE output from it.
I use
sed -e "/^TRACE/,/^DEBUG\|^ERROR/d" <log.txt
... and get
DEBUG : <line1>
<line2>
<line22>
<line23>
<line42>
<line62>
sed deletes the range inclusively, so the DEBUG or ERROR line that follows each TRACE block is removed as well. I tried other approaches with sed, but couldn't find a way to remove only the TRACE blocks.
sed is pretty good, but maybe I should use another Unix utility. Please advise.
Upvotes: 0
Views: 760
Reputation: 3466
Here's a solution where sed's hold space is used as a boolean: the current line will be output if and only if the hold space is empty.
sed -e '/^TRACE/ h ; /^\(DEBUG\|ERROR\)/ { x ; s/.*// ; x } ; x ; /./ { x ; d } ; x'
It works as follows:
If the current line matches ^TRACE, then it is put in the hold space so that output will be disabled (see below).
If the current line matches ^\(DEBUG\|ERROR\), then the hold space is cleared so that the output will be enabled (see below).
x ; /./ { x ; d } ; x deletes the current pattern space if and only if the hold space contains something. Since -n is not used, the pattern space will be output if it has not been deleted. As wanted, the hold space is preserved (for the other lines of the blocks): exactly 2 x are executed (because d starts a new cycle).
Notes:
[...] ^\(TRACE\|DEBUG\|ERROR\) (only the TRACE blocks will be removed).
EDIT: In more complex cases, one may need to swap the first two tests. For instance, with the file
Checking test1
some line 1a
some line 1b
Checking test2
Checking test3
some line 3a
some line 3b
Checking test4
some line 4
Checking test5
if one wants to remove the output corresponding to test3, one should use
sed -e '/^Checking/ { x ; s/.*// ; x } ; /test3/ h ; x ; /./ { x ; d } ; x'
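As a worked check of the swapped order, here is a small sketch that feeds the sample file to that command. The function name drop_test3 is made up for illustration, and a ";" has been added before each closing brace, which GNU sed does not need but some other sed implementations appear to require.

```shell
#!/bin/sh
# drop_test3 is a hypothetical wrapper around the command quoted above.
# The "Checking" test runs first and clears the hold space; the "test3"
# test runs second, so on the "Checking test3" line the hold space ends
# up non-empty and output stays disabled until the next "Checking" line.
drop_test3() {
    sed -e '/^Checking/ { x ; s/.*// ; x ; } ; /test3/ h ; x ; /./ { x ; d ; } ; x'
}

drop_test3 <<'EOF'
Checking test1
some line 1a
some line 1b
Checking test2
Checking test3
some line 3a
some line 3b
Checking test4
some line 4
Checking test5
EOF
```

With the tests in the other order, the "Checking test3" line would first be copied to the hold space and then immediately cleared again by the "Checking" test, so nothing would be removed.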
Upvotes: 0
Reputation: 36229
You could duplicate your tags, to remove just the first of them:
sed -E "s/^((DEBUG)|(ERROR)) : /\1 : \n\1 /" | \
sed "/^TRACE/,/^DEBUG\|^ERROR/d" | sed "s/^</\t</"
DEBUG :
DEBUG <line1>
<line2>
DEBUG <line21>
<line22>
<line23>
ERROR <line41>
<line42>
DEBUG <line61>
<line62>
The last sed command is only there for readability, and the first line is left as an exercise. :)
Upvotes: 0
Reputation: 4866
Here is a way to do what you want in sed, although this is a situation where I would normally use perl. This uses sed's "hold space" to collect each section of the log file, and prints (or not) the whole section once it sees the start of the next section.
sed -n -e '/^\(TRACE\|DEBUG\|ERROR\)/ ! { H ; $!b } ; x ; /^\(DEBUG\|ERROR\)/ p'
However, responding to the subject of the question, I don't think it's possible to exclude the last row from the range.
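One possible workaround, offered as a sketch rather than a definitive answer: the range itself still includes its last line, but the d command can be restricted to the lines of the range that do not match the end pattern. The function name strip_trace is hypothetical, and -E (extended regular expressions) is assumed to be supported by the sed in use.

```shell
#!/bin/sh
# strip_trace is a hypothetical name. Inside the TRACE..(DEBUG|ERROR)
# range, every line is deleted except the one that closes the range,
# so the DEBUG or ERROR header after a TRACE block survives.
strip_trace() {
    sed -E '/^TRACE/,/^(DEBUG|ERROR)/ { /^(DEBUG|ERROR)/ !d; }' "$@"
}

# Usage: strip_trace log.txt   (or: strip_trace <log.txt)
```

This relies on every TRACE block being followed by a DEBUG or ERROR header; a TRACE block at the very end of the file is simply deleted up to the last line.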
Upvotes: 2
Reputation: 212248
This provides a reasonable technique for splitting the input to different places. It would be nice to use a case, but if you insist on anchoring the string to the beginning of the line, I don't believe that is possible.
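#!/bin/sh
exec 3>&1          # DEBUG blocks go to the original stdout
exec 4>/dev/null   # TRACE blocks are discarded
exec 5>&1          # ERROR blocks go to the original stdout
while IFS= read -r line; do
    # grep prints a matching header to its descriptor; stdout is then
    # pointed at the same place so the continuation lines follow it.
    printf '%s\n' "$line" | grep '^DEBUG' >&3 && exec >&3 && continue
    printf '%s\n' "$line" | grep '^TRACE' >&4 && exec >&4 && continue
    printf '%s\n' "$line" | grep '^ERROR' >&5 && exec >&5 && continue
    printf '%s\n' "$line"
done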
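For what it is worth, a case pattern is matched against the whole string, so a pattern such as TRACE* only matches lines that begin with TRACE. Under that reading, a case-based variant might look like the sketch below. It is a simplification: it only drops TRACE blocks instead of routing each level to its own descriptor, and the names filter_trace and keep are made up for illustration.

```shell
#!/bin/sh
# filter_trace and keep are hypothetical names, not part of the script above.
filter_trace() {
    keep=1
    while IFS= read -r line; do
        case $line in
            TRACE*)        keep=0 ;;   # a TRACE header starts a dropped block
            DEBUG*|ERROR*) keep=1 ;;   # any other header re-enables output
        esac
        if [ "$keep" -eq 1 ]; then
            printf '%s\n' "$line"
        fi
    done
}

# Usage: filter_trace <log.txt
```

This avoids spawning a grep process for every input line, at the cost of the per-level redirection.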
Upvotes: 0
Reputation: 51603
awk '/^TRACE/ {
    # Skip lines until the next DEBUG or ERROR header, or the end of input.
    while ((getline) > 0) {
        if ($0 ~ /^(DEBUG|ERROR)/) {
            print $0
            next
        }
    }
    exit   # the input ended inside a TRACE block
}
{ print $0 }' FILENAME
AWK to the rescue ;-) (Note: it can be pasted to one line.)
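The same filtering can probably be written without getline by keeping a flag, which sidesteps the end-of-input handling altogether. This is a sketch; the function name strip_trace_awk and the variable skip are made up for illustration.

```shell
#!/bin/sh
# strip_trace_awk and skip are hypothetical names.
strip_trace_awk() {
    awk '
        /^TRACE/         { skip = 1 }   # entering a TRACE block
        /^(DEBUG|ERROR)/ { skip = 0 }   # any other header ends it
        !skip                           # print only while not skipping
    ' "$@"
}

# Usage: strip_trace_awk log.txt
```

Lines before the first header are printed, since skip starts out unset.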
Upvotes: 0