Reputation: 261

How to exclude last row in SED range

I have a log file in which a single log entry can span multiple rows, like this:

DEBUG : <line1>
<line2>
TRACE : <line11>
<line12>
<line13>
DEBUG : <line21>
<line22>
<line23>
TRACE : <line31>
<line32>
ERROR : <line41>
<line42>
TRACE : <line51>
<line52>
DEBUG : <line61>
<line62>

I need to strip the TRACE output from it.

I use

sed -e "/^TRACE/,/^DEBUG\|^ERROR/d" <log.txt

... and get

DEBUG : <line1>
<line2>
<line22>
<line23>
<line42>
<line62>

Sed deletes the range inclusively, so the DEBUG or ERROR line that follows each TRACE block is removed as well. I tried other approaches with sed, but could not find a way to remove only the TRACE blocks.

Sed is pretty good, but maybe I should use another Unix utility. Please advise.

Upvotes: 0

Answers (5)

vinc17

Reputation: 3466

Here's a solution where sed's hold space is used as a boolean: the current line will be output if and only if the hold space is empty.

sed -e '/^TRACE/ h ; /^\(DEBUG\|ERROR\)/ { x ; s/.*// ; x } ; x ; /./ { x ; d } ; x'

It works as follows:

If the current line matches ^TRACE, then it is put in the hold space so that output will be disabled (see below).
If the current line matches ^\(DEBUG\|ERROR\), then the hold space is cleared so that the output will be enabled (see below).
The x ; /./ { x ; d } ; x deletes the current pattern space if and only if the hold space contains something. Since -n is not used, the pattern space will be output if it has not been deleted. As wanted, the hold space is preserved (for the other lines of the blocks): exactly 2 x are executed (because d starts a new cycle).
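
The steps above can be spelled out as a commented, multi-line script. This is only a sketch of the same one-liner: `strip_trace` is a hypothetical helper name, and the `\|` alternation assumes GNU sed.

```shell
#!/bin/sh
# strip_trace: hypothetical wrapper around the one-liner (assumes GNU sed for \|).
strip_trace() {
    sed -e '
        # TRACE header: copy it to the hold space, which switches output off.
        /^TRACE/ h
        # DEBUG or ERROR header: empty the hold space, which switches output on.
        /^\(DEBUG\|ERROR\)/ { x ; s/.*// ; x }
        # Inspect the hold space; if it is non-empty, swap back and delete.
        x
        /./ { x ; d }
        # The hold space was empty: swap back so the line is printed.
        x
    ' "$@"
}

# Usage: read from stdin or pass file names as arguments.
printf '%s\n' 'DEBUG : <line1>' '<line2>' 'TRACE : <line11>' '<line12>' \
    'DEBUG : <line21>' '<line22>' | strip_trace
```

With this input, the two TRACE rows are dropped and the four DEBUG rows are printed unchanged.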

Notes:

This solution works even if the first line does not match ^\(TRACE\|DEBUG\|ERROR\) (only the TRACE blocks will be removed).
Lines are not accumulated in the pattern space or the hold space, so this solution is memory efficient and avoids the size limit that POSIX allows implementations to impose: "The pattern and hold spaces shall each be able to hold at least 8192 bytes." (Text files with huge lines might still hit such a limit, but nothing can be done with sed in that case anyway.)

EDIT: In more complex cases, one may need to swap the first two tests. For instance, with the file

Checking test1
some line 1a
some line 1b
Checking test2
Checking test3
some line 3a
some line 3b
Checking test4
some line 4
Checking test5

if one wants to remove the output corresponding to test3, one should use

sed -e '/^Checking/ { x ; s/.*// ; x } ; /test3/ h ; x ; /./ { x ; d } ; x'
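
A small run of that command on the sample file, as a sketch; `drop_test3` is a hypothetical helper name. The order matters here because the line "Checking test3" matches both tests: the hold space has to be cleared first and set second, otherwise it would be set and then immediately cleared, and nothing would be removed.

```shell
#!/bin/sh
# drop_test3: hypothetical wrapper around the command above.
drop_test3() {
    # First test clears the hold space on every "Checking" line,
    # second test sets it again when the line mentions test3.
    sed -e '/^Checking/ { x ; s/.*// ; x } ; /test3/ h ; x ; /./ { x ; d } ; x' "$@"
}

drop_test3 <<'EOF'
Checking test1
some line 1a
some line 1b
Checking test2
Checking test3
some line 3a
some line 3b
Checking test4
some line 4
Checking test5
EOF
```

The output keeps everything except "Checking test3" and its two "some line 3" rows.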

Upvotes: 0

user unknown

Reputation: 36229

You could duplicate your tags, to remove just the first of them:

sed -E "s/^((DEBUG)|(ERROR)) : /\1 : \n\1 /" | \
sed "/^TRACE/,/^DEBUG\|^ERROR/d" | sed "s/^</\t</"

DEBUG : 
DEBUG <line1>
      <line2>
DEBUG <line21>
      <line22>
      <line23>
ERROR <line41>
      <line42>
DEBUG <line61>
      <line62>

The last sed command is just for better readability, and the first line is left as an exercise. :)

Upvotes: 0

Andy

Reputation: 4866

Here is a way to do what you want in sed, although this is a situation where I would normally use perl. This uses sed's "hold space" to collect each section of the log file, and prints (or not) the whole section once it sees the start of the next section.

sed -n -e '/^\(TRACE\|DEBUG\|ERROR\)/ ! { H ; $!b } ; x ; /^\(DEBUG\|ERROR\)/ p'
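
For readers who want the one-liner unpacked, here is a commented sketch of the same script. `strip_trace_sections` is a hypothetical helper name, and the `\|` alternation assumes GNU sed.

```shell
#!/bin/sh
# strip_trace_sections: hypothetical wrapper (assumes GNU sed for \|).
strip_trace_sections() {
    sed -n -e '
        # Continuation line: append it to the section collected in the hold
        # space; unless it is the last input line, skip to the next cycle.
        /^\(TRACE\|DEBUG\|ERROR\)/ ! {
            H
            $!b
        }
        # Header line (or last line): swap, so the finished section is in the
        # pattern space and the new header starts the next section.
        x
        # Print the finished section only if it began with DEBUG or ERROR.
        /^\(DEBUG\|ERROR\)/ p
    ' "$@"
}

printf '%s\n' 'DEBUG : <line1>' '<line2>' 'TRACE : <line11>' '<line12>' \
    'ERROR : <line41>' '<line42>' | strip_trace_sections
```

One caveat of this sketch: if the very last input line is itself a header, it ends up in the hold space and is never printed.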

However, responding to the subject of the question, I don't think it's possible to exclude the last row from the range.
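
A possible workaround, sketched here as an assumption rather than as part of this answer: the range address itself stays inclusive, but the `d` can be limited to lines that do not match the closing pattern, so the closing DEBUG or ERROR row is still printed. `strip_trace_range` is a hypothetical helper name, and `\|` again assumes GNU sed.

```shell
#!/bin/sh
# strip_trace_range: hypothetical helper (assumes GNU sed for \|).
strip_trace_range() {
    # Inside the TRACE..DEBUG/ERROR range, delete every line except
    # the DEBUG or ERROR line that closes the range.
    sed -e '/^TRACE/,/^\(DEBUG\|ERROR\)/ { /^\(DEBUG\|ERROR\)/!d ; }' "$@"
}

printf '%s\n' 'DEBUG : <line1>' '<line2>' 'TRACE : <line11>' '<line12>' \
    'DEBUG : <line21>' '<line22>' | strip_trace_range
```

Single quotes are used so that an interactive shell does not try to expand the `!`.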

Upvotes: 2

William Pursell

Reputation: 212248

This provides a reasonable technique for splitting the input to different places. It would be nice to use a case, but if you insist on anchoring the string to the beginning of the line, I don't believe that is possible.

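#!/bin/sh

# One descriptor per log level: DEBUG and ERROR go to stdout, TRACE is discarded.
exec 3>&1
exec 4> /dev/null
exec 5>&1
while IFS= read -r line; do
    # A header line is written to its descriptor, and stdout is switched to
    # that descriptor so the continuation lines that follow go the same way.
    printf '%s\n' "$line" | grep '^DEBUG' >&3 && exec >&3 && continue
    printf '%s\n' "$line" | grep '^TRACE' >&4 && exec >&4 && continue
    printf '%s\n' "$line" | grep '^ERROR' >&5 && exec >&5 && continue
    printf '%s\n' "$line"
done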

Upvotes: 0

Zsolt Botykai

Reputation: 51603

awk '/^TRACE/ {
                while ( (getline) > 0 ) {
                    if ( $0 ~ /^DEBUG/ || $0 ~ /^ERROR/ ) {
                        print $0 ;
                        next
                    }
                }
                exit
              }
     { print $0 }' FILENAME

AWK to the rescue ;-) (Note: it can be pasted to one line.)
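
A flag-based variant, offered only as an alternative sketch and not as the method above: instead of consuming lines with getline, it remembers whether the current line belongs to a TRACE block. `strip_trace_awk` and `skip` are hypothetical names.

```shell
#!/bin/sh
# strip_trace_awk: hypothetical helper; skip is a flag that is 1 inside a TRACE block.
strip_trace_awk() {
    awk '
        /^TRACE/         { skip = 1 }   # a TRACE header starts a block to drop
        /^(DEBUG|ERROR)/ { skip = 0 }   # a DEBUG or ERROR header ends it
        !skip                           # print every line while not skipping
    ' "$@"
}

printf '%s\n' 'DEBUG : <line1>' '<line2>' 'TRACE : <line11>' '<line12>' \
    'ERROR : <line41>' '<line42>' | strip_trace_awk
```

Because there is no getline loop, input that ends in the middle of a TRACE block needs no special handling.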

Upvotes: 0
