Reputation: 33
I've been searching here and got close, but nothing I found quite does what I'm trying to do. Please consider the following sample test input. The objective is to find matches that span multiple lines: each match starts with a line that contains "abc" (print this line), ends with a line that contains "efg" (also print this line), and includes the lines in between (print those too).
yyabc}
000
iiabc<
{efg+1}
111
yyabc}
222
p {efg+13}
zzz
z {efg+243} {}
iii
oooabc>
ooo
The closest I have come to what I'm looking for, with zzz as the test input file containing the lines above, is
sed -e '/abc/,/efg/!d' zzz
but the output includes extra lines that I would rather not have:
yyabc} <<***** extra
000 <<***** extra
iiabc<
{efg+1}
yyabc}
222
p {efg+13}
oooabc> <<***** extra
ooo <<***** extra
so the expected output is:
iiabc<
{efg+1}
yyabc}
222
p {efg+13}
Without relying on pcregrep (I have everything else on the Linux box), is there a solution that can produce this kind of multi-line match?
Thanks much.
Upvotes: 2
Views: 160
Reputation: 58488
This might work for you (GNU sed):
sed -n '/abc/,/efg/{/abc/{h;d};H;/efg/{g;p}}' file
This uses sed in "grep" mode by invoking the -n switch. The range /abc/,/efg/ filters the lines of interest between abc and efg. The hold space (HS) stores the lines of each block, starting over at every abc line, and the block is printed when the efg line is reached.
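For readability, the same one-liner can be laid out as a commented multi-line script. This is only a sketch: the function name print_blocks is made up for illustration, and the indented # comment lines inside the script assume a sed that accepts them, as GNU sed does.

```shell
# print_blocks FILE  (hypothetical helper name, not part of the answer)
print_blocks() {
  sed -n '
    /abc/,/efg/ {
      # An abc line restarts the block: overwrite the hold space with it
      # and go straight to the next input line.
      /abc/ {
        h
        d
      }
      # Any other line inside the range is appended to the hold space.
      H
      # On the efg line, copy the hold space back and print the block.
      /efg/ {
        g
        p
      }
    }
  ' "$1"
}
```

With the sample input saved as zzz, print_blocks zzz should give the same five lines as the expected output in the question.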
Alternative:
sed -n '/abc/,/efg/{/abc/h;//!H;/efg/{x;p}}' file
Upvotes: 1
Reputation: 35208
Using a perl one-liner that slurps the entire file:
perl -0777 -ne 'print /.*abc.*\n(?:(?!.*(?:abc|efg)).*\n)*.*efg.*\n/g' file.txt
Or a line by line buffered solution:
perl -ne '
$b = /abc/ ? $_ : "$b$_";
print $b if (/abc/ .. /efg/) =~ /E/
' file.txt
Switches:
-0777: Slurp the entire file.
-n: Creates a while(<>){...} loop for each “line” in your input file.
-e: Tells perl to execute the code on the command line.
Upvotes: 1
Reputation: 14965
A straightforward array-based awk solution:
awk '/abc/ {delete a;j=0;flag=1}
flag {a[++j]=$0}
/efg/ && flag {for (i=1;i<=j;i++){print a[i]};flag=0}' inputfile
/abc/ {delete a;j=0;flag=1}
When the initial pattern is found, delete the array, set the counter to zero, and turn on the "find" flag.
flag {a[++j]=$0}
Store the line content while the flag is on.
/efg/ && flag {for (i=1;i<=j;i++){print a[i]};flag=0}
When the end pattern is found and the flag is on, print the array and turn off the flag.
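As a possible generalization (not part of the answer itself), the two patterns can be passed in as awk variables. The wrapper name print_between and the names first, last, buf, n and inblock are assumptions made for this sketch. It resets a counter instead of deleting the array, which has the same effect here because only entries 1..n are ever printed. Note that awk processes backslash escapes in -v values, so patterns containing backslashes would need extra care.

```shell
# print_between START_REGEX END_REGEX FILE  (hypothetical wrapper)
print_between() {
  awk -v first="$1" -v last="$2" '
    # A start line opens a block and discards anything buffered so far.
    $0 ~ first { n = 0; inblock = 1 }
    # While a block is open, buffer every line (start and end lines included).
    inblock { buf[++n] = $0 }
    # An end line inside an open block flushes the buffer and closes the block.
    inblock && $0 ~ last {
      for (i = 1; i <= n; i++) print buf[i]
      inblock = 0
    }
  ' "$3"
}
```

For the sample file, the call would be print_between abc efg zzz.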
Upvotes: 0
Reputation: 113924
awk is well suited to this task. If your test input file is called zzz, then run:
$ awk '/abc/{a=""} /abc/,/efg/{a=a"\n"$0} /efg/ && a!=""{print substr(a,2);a=""}' zzz
iiabc<
{efg+1}
yyabc}
222
p {efg+13}
Explanation:
/abc/{a=""}
Every time a line containing "abc" is reached, set the variable a to an empty string. (The lines that we want to print will be added to this variable in the next step.)
/abc/,/efg/{a=a"\n"$0}
Over every range of lines that starts with a line containing abc and ends with a line containing efg, each line is appended to the variable a.
/efg/ && a!=""{print substr(a,2);a=""}
When the last line in the range is reached, print out a. Because a begins with an extra newline character, we use substr to remove it. The a!="" test skips efg lines that fall outside any range (such as z {efg+243} {} in the sample), which would otherwise print an empty line.
Without the first step above, the program runs fine but the "extra" lines would be printed. With the first step included, they are eliminated.
Upvotes: 1
Reputation: 10039
sed -n '/abc/,/efg/ {
H
/efg/ {
g
:a
s/^.*\n\(.*abc\)/\1/
ta
p
}
}' zzz
This uses the hold buffer to catch the part between abc and the first efg, then removes any line before the last abc line, and finally prints the result and continues with the rest of the text.
It does not work if abc is on the same line as efg with no previous abc in the "same" part of the text, because a sed /start/,/end/ range runs from the start pattern on one line until the end pattern on ANOTHER line.
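The range behaviour behind this limitation can be checked with a small made-up input whose first line contains both patterns. The function name range_demo and the four input lines are hypothetical; the sed command is just the plain range print.

```shell
range_demo() {
  # Line 1 contains both abc and efg. sed only starts looking for the
  # end pattern on the NEXT line, so the range stays open until line 3.
  printf '%s\n' 'abc efg' 'x' 'efg' 'y' | sed -n '/abc/,/efg/p'
}
```

Calling range_demo prints the first three lines (abc efg, x, efg) and drops y, which shows that the first line did not close its own range.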
Upvotes: 1
Reputation: 67988
(.*?abc(?:(?:(?!efg|abc).)|\n)*efg.*$)
Try this regex with perl.
See the demo:
http://regex101.com/r/bA0jG5/11
Upvotes: 0