Reputation: 24160
I have a log in the following format:
<<
[ABC] some other data
some other data
>>
<<
DEF some other data
some other data
>>
<<
[ABC] some other data
some other data
>>
I want to select all log entries that contain [ABC]. The expected result is:
<<
[ABC] some other data
some other data
>>
<<
[ABC] some other data
some other data
>>
What would the sed expression for this be? To fetch the contents between << and >>, the expression is:
sed -e '/<</,/>>/!d'
But how can I restrict it to blocks that have [ABC] in between?
Upvotes: 0
Views: 1742
Reputation: 67301
This works on my side:
awk '$0~/ABC/{print "<<";print;getline;print;getline;print }' temp.txt
Tested as below:
pearl.242> cat temp.txt
<<
[ABC] some other data
some other data
>>
<<
DEF some other data
some other data
>>
nkeem
<<
[ABC] some other data
some other data
>>
pearl.243> awk '$0~/ABC/{print "<<";print;getline;print;getline;print }' temp.txt
<<
[ABC] some other data
some other data
>>
<<
[ABC] some other data
some other data
>>
pearl.244>
If you do not want to hard-code the print "<<"; statement, you can use the following instead:
pearl.249> awk '$0~/ABC/{print x;print;getline;print;getline;print}{x=$0}' temp.txt
<<
[ABC] some other data
some other data
>>
<<
[ABC] some other data
some other data
>>
pearl.250>
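Note that both one-liners above assume exactly two lines follow the matching line, and they match ABC anywhere on a line. A minimal sketch that drops both assumptions while keeping the same "remember the previous line" idea: printing is switched on when the line right after << starts with [ABC] and switched off again at >>. The function name select_abc_stream and the variable names are made up for illustration, and the sketch assumes << and >> sit alone on their lines.

```shell
# Hypothetical helper: pass the log file as an argument or pipe it in.
select_abc_stream() {
  awk '
    # The line right after "<<" starts with [ABC]: emit the remembered
    # opening marker and switch printing on.
    prev == "<<" && index($0, "[ABC]") == 1 { print prev; printing = 1 }
    # While printing is on, pass every line through (including ">>").
    printing { print }
    # The closing marker ends the current entry.
    $0 == ">>" { printing = 0 }
    # Remember this line for the next cycle.
    { prev = $0 }
  ' "$@"
}

# Usage: select_abc_stream temp.txt
```

Because nothing is buffered, this should behave the same for entries of any length.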
Upvotes: 1
Reputation: 58568
This might work for you:
sed '/^<</,/^>>/{/^<</{h;d};H;/^>>/{x;/^<<\n\[ABC\]/p}};d' file
<<
[ABC] some other data
some other data
>>
<<
[ABC] some other data
some other data
>>
sed comes equipped with a register called the hold space (HS).
You can use the HS to collect data of interest, in this case the lines between /^<</,/^>>/
h replaces whatever is in the HS with what is in the pattern space (PS)
H appends a newline \n and then the PS to the HS
x swaps the HS for the PS
N.B. This deletes all lines other than those between <<...>> containing [ABC].
If you want to retain other lines use:
sed '/^<</,/^>>/{/^<</{h;d};H;/^>>/{x;/^<<\n\[ABC\]/p};d}' file
<<
[ABC] some other data
some other data
>>
<<
[ABC] some other data
some other data
>>
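For readers who find the one-liner hard to follow, here is the same hold-space technique spread over several lines with comments. It is a sketch rather than a drop-in replacement: it uses sed -n instead of the trailing d (so only matching blocks are printed), and the function name select_abc_sed is made up for illustration.

```shell
# Hypothetical helper: pass the log file as an argument or pipe it in.
select_abc_sed() {
  sed -n '
    /^<</,/^>>/ {
      # Opening marker: start a fresh block in the hold space.
      /^<</ { h; d; }
      # Every other line in the range: append it to the hold space.
      H
      # Closing marker: swap the collected block into the pattern space
      # and print it only if the line after the opening marker
      # starts with [ABC].
      /^>>/ { x; /^<<\n\[ABC\]/p; }
    }
  ' "$@"
}

# Usage: select_abc_sed file
```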
Upvotes: 2
Reputation: 58647
TXR: built for multi-line stuff.
@(collect)
<<
[ABC] @line1
@line2
>>
@ (output)
<<
[ABC] @line1
@line2
>>
@ (end)
@(end)
Run:
$ txr data.txr data
<<
[ABC] some other data
some other data
>>
<<
[ABC] some other data
some other data
>>
Very basic stuff; you're probably better off sticking to awk until you have a very complicated multi-line extraction job with irregular data with numerous cases, lots of nesting, etc.
If the log is very large, we should write @(collect :vars ()) so the collect doesn't implicitly accumulate lists; then the job will run in constant memory.
Also, if the logs are not always two lines, it becomes a little more complicated. We can use a nested collect to gather the variable number of lines.
@(collect :vars ())
<<
[ABC] @line1
@ (collect)
@line
@ (until)
>>
@ (end)
@ (output)
<<
[ABC] @line1
@ {line "\n"}
>>
@ (end)
@(end)
Upvotes: 0
Reputation: 29266
To me, sed is line based. You can probably talk it into being multi line, but it would be easier to start the job with awk or perl rather than trying to do it in sed.
I'd use perl and make a little state machine like this pseudocode (I don't guarantee it'll catch every little detail of what you are trying to achieve):
state = 0
buffer = ''
for each line
    if state == 0
        if line == '<<'
            buffer = line
            state = 1
    else if state == 1
        if line starts with '[ABC]'
            buffer += line
            state = 2
        else
            state = 0
    else if state == 2
        buffer += line
        if line == '>>'
            do something with buffer
            state = 0
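A rough runnable version of that state machine, written in awk rather than perl so it sits next to the other answers. The function name select_abc_state is made up, "do something with buffer" is taken to mean "print it", and it assumes << and >> stand alone on their lines.

```shell
# Hypothetical helper: pass the log file as an argument or pipe it in.
select_abc_state() {
  awk '
    BEGIN { state = 0 }
    # State 0: waiting for an opening marker.
    state == 0 {
      if ($0 == "<<") { buffer = $0; state = 1 }
      next
    }
    # State 1: the line after "<<" decides whether the entry is kept.
    state == 1 {
      if (index($0, "[ABC]") == 1) {
        buffer = buffer "\n" $0
        state = 2
      } else {
        state = 0
      }
      next
    }
    # State 2: collect lines until the closing marker, then print.
    state == 2 {
      buffer = buffer "\n" $0
      if ($0 == ">>") { print buffer; state = 0 }
    }
  ' "$@"
}

# Usage: select_abc_state logfile
```

Buffering the whole entry before printing means an entry that is cut off before its >> is never emitted.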
See also http://www.catonmat.net/blog/awk-one-liners-explained-part-three/ for some hints on how you might do it with awk as a 1 liner...
Upvotes: 0