dinan5m3

Reputation: 33

Complex matching across multiple lines

I've been searching here and got close, but nothing quite does what I'm trying to do. Please consider the following sample test input. The objective is to find matches that span multiple lines: each match starts with a line that contains "abc" (print this line) and ends with a line that contains "efg" (also print this line), and the lines in between should be printed as well.

yyabc}
000
iiabc<
    {efg+1}
111
yyabc}
222
 p  {efg+13}
zzz
   z   {efg+243} {}
iii
oooabc>
ooo

The closest I have come to what I'm looking for is the following, with zzz as the test input file containing the lines above:

sed -e '/abc/,/efg/!d' zzz

but it includes extra lines that I would rather not have in the output:

yyabc}   <<***** extra
000      <<***** extra
iiabc<
    {efg+1}
yyabc}
222
 p  {efg+13}
oooabc>  <<***** extra
ooo      <<***** extra

so the expected output is:

iiabc<
    {efg+1}
yyabc}
222
 p  {efg+13}

Besides relying on pcregrep (I have everything else on the Linux box), is there a solution that can produce this kind of multi-line match?

Thanks much.

Upvotes: 2

Views: 160

Answers (6)

potong

Reputation: 58488

This might work for you (GNU sed):

sed -n '/abc/,/efg/{/abc/{h;d};H;/efg/{g;p}}' file

This uses sed in "grep" mode by invoking the -n switch. It filters the lines of interest between abc and efg. Each abc line restarts the hold space (HS), the following lines are appended to it, and the efg line prints the stored block, inclusive of both ends.
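For readers who find the one-liner dense, here is a minimal sketch of the same script spread over several lines with comments. The function name extract_blocks is only an illustrative wrapper, not part of the original answer, and the behaviour is meant to be identical to the command above.

```shell
# Expanded form of the one-liner; extract_blocks is a hypothetical wrapper name.
extract_blocks() {
  sed -n '
    /abc/,/efg/ {
      # An abc line restarts the buffer: overwrite the hold space, go to the next line.
      /abc/ { h; d; }
      # Any other line in the range is appended to the hold space.
      H
      # On the efg line, copy the hold space back and print the whole block.
      /efg/ { g; p; }
    }
  ' "$@"
}

# Usage with the sample file from the question:
#   extract_blocks zzz
```

Because every abc line overwrites the hold space, only the lines from the last abc before an efg survive, which is what drops the "extra" lines.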

Alternative:

sed -n '/abc/,/efg/{/abc/h;//!H;/efg/{x;p}}' file

Upvotes: 1

Miller

Reputation: 35208

Using a perl one-liner that slurps the entire file:

perl -0777 -ne 'print /.*abc.*\n(?:(?!.*(?:abc|efg)).*\n)*.*efg.*\n/g' file.txt

Or a line by line buffered solution:

perl -ne '
    $b = /abc/ ? $_ : "$b$_";
    print $b if (/abc/ .. /efg/) =~ /E/
  ' file.txt

Switches:

  • -0777: Slurp the entire file.
  • -n: Creates a while(<>){...} loop for each “line” in your input file.
  • -e: Tells perl to execute the code on command line.

Upvotes: 1

Juan Diego Godoy Robles

Reputation: 14965

A straightforward array based awk solution:

awk '/abc/ {delete a;j=0;flag=1}
     flag    {a[++j]=$0}
     /efg/ && flag {for (i=1;i<=j;i++){print a[i]};flag=0}' inputfile

/abc/ {delete a;j=0;flag=1} : When the start pattern is found, delete the array, set the counter to zero and turn on the "find" flag.

flag {a[++j]=$0} : Store the line content while the flag is on.

/efg/ && flag {for (i=1;i<=j;i++){print a[i]};flag=0} : When the end pattern is found and the flag is on, print the array and turn off the flag.
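If the start and end patterns change often, the same three rules can be wrapped so that the patterns are passed in. This is only a sketch: print_blocks, start_re and end_re are names made up for illustration, and resetting the counter is used in place of deleting the array. Note that awk processes backslash escapes in -v values, so patterns containing backslashes may need doubling.

```shell
# $1 = start regex, $2 = end regex, remaining arguments = input files.
# print_blocks, start_re and end_re are hypothetical names.
print_blocks() {
  start_re=$1
  end_re=$2
  shift 2
  awk -v start_re="$start_re" -v end_re="$end_re" '
    # Start pattern: restart the buffer and raise the flag.
    $0 ~ start_re { n = 0; flag = 1 }
    # While the flag is on, store the current line.
    flag { buf[++n] = $0 }
    # End pattern with the flag on: print the stored lines and lower the flag.
    flag && $0 ~ end_re { for (i = 1; i <= n; i++) print buf[i]; flag = 0 }
  ' "$@"
}

# Usage with the sample file from the question:
#   print_blocks abc efg inputfile
```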

Upvotes: 0

John1024

Reputation: 113924

awk is well suited to this task. If your test input file is called zzz, then run:

$ awk '/abc/{a=""} /abc/,/efg/{a=a"\n"$0} /efg/{print substr(a,2);a=""}' zzz
iiabc<
    {efg+1}
yyabc}
222
 p  {efg+13}

Explanation:

  • /abc/{a=""}

    Every time that a line containing "abc" is reached, set the variable a to an empty string. (The lines that we want to print will be added to this variable in the next step.)

  • /abc/,/efg/{a=a"\n"$0}

    Over every range of lines that starts with a line containing abc and ends with a line containing efg, each line is appended to the variable a.

  • /efg/{print substr(a,2);a=""}

    When the last line in the range is reached, print out a. Because a begins with an extra newline character, we use substr to remove it.

Without the first step above, the program runs fine but the "extra" lines would be printed. With the first step included, they are eliminated.
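One thing to watch for: the third rule fires on every line containing efg, including a line such as "z {efg+243} {}" that sits outside any open range. At that point a is empty, so print emits an empty line. If that matters, a minimal variant is to track an explicit flag and print only when a block is open. The names blocks_no_blank, buf and inblock below are made up for this sketch.

```shell
# Variant that ignores efg lines when no abc line has opened a block.
# blocks_no_blank, buf and inblock are hypothetical names.
blocks_no_blank() {
  awk '
    # An abc line opens (or restarts) a block and clears the buffer.
    /abc/   { buf = ""; inblock = 1 }
    # While a block is open, collect each line together with its newline.
    inblock { buf = buf $0 "\n" }
    # Only an efg line inside an open block prints; stray efg lines do nothing.
    inblock && /efg/ { printf "%s", buf; inblock = 0 }
  ' "$@"
}

# Usage with the sample file from the question:
#   blocks_no_blank zzz
```

Appending the newline after each line, instead of before it, also removes the need for substr.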

Upvotes: 1

NeronLeVelu

Reputation: 10039

sed -n '/abc/,/efg/ {
   H
   /efg/ {
      g
:a
      s/^.*\n\(.*abc\)/\1/
      ta
      p
      }
   }' zzz

This uses the hold buffer to collect everything from abc up to the first efg, then removes any lines before the last abc line, and finally prints the result and continues with the rest of the text.

It does not work if abc is on the same line as efg with no previous abc from the "same" part of the text, because a sed range address (/abc/,/efg/) runs from a pattern match on one line to a pattern match on ANOTHER line.
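A quick way to see this limitation in isolation, using a plain range with p instead of the full script (range_demo is a hypothetical name for the sketch):

```shell
# The first line contains both abc and efg, yet the range does not close on it,
# because sed starts looking for the end pattern on the following line.
range_demo() {
  printf '%s\n' 'x abc efg' 'unrelated' | sed -n '/abc/,/efg/p'
}

range_demo
# x abc efg
# unrelated
```

Since no later line contains efg, the range stays open until the end of input and the unrelated line is printed as well.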

Upvotes: 1

vks

Reputation: 67988

(.*?abc(?:(?:(?!efg|abc).)|\n)*efg.*$)

Try this through perl.

See demo.

http://regex101.com/r/bA0jG5/11

Upvotes: 0
