carter

Reputation: 5432

regex command line linux - select all lines between two strings

I have a text file with contents like this:

here is some super text:
  this is text that should
  be selected with a cool match
And this is how it all ends
blah blah...

I am trying to get the two lines (there could be more or fewer) between:

some super text:

and

And this is how

I am using grep on an Ubuntu machine, and a lot of the patterns I've found seem to be specific to particular regex engines.

So I should end up with something like this:

grep "my regex goes here" myFileNameHere

I'm not sure whether egrep is needed, but I could use that just as easily.

Upvotes: 1

Views: 1359

Answers (4)

choroba

Reputation: 241758

You can use addresses in sed:

sed -e '/some super text/,/And this is how/!d' file

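!d means "delete every line that is not in the range", so only the lines from the start pattern through the end pattern (both included) are printed.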
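As a quick check, here is a minimal sketch that runs the range command above on the sample from the question. The temporary file and the variable names (sample, inclusive) are only for illustration.

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
here is some super text:
  this is text that should
  be selected with a cool match
And this is how it all ends
blah blah...
EOF

# Delete every line outside the range; both border lines are kept.
inclusive=$(sed -e '/some super text/,/And this is how/!d' "$sample")
printf '%s\n' "$inclusive"

rm -f "$sample"
```

This should print the first four lines of the sample and drop only "blah blah...".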

To exclude the border lines, you must be more clever:

sed -n -e '/some super text/ {n;b c}; d;:c {/And this is how/ {d};p;n;b c}' file
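The one-liner is dense, so here is a sketch of the same logic written with one command per line and comments. It is meant as a readable equivalent, not a different method; the temporary file and the variable names (sample, inner) are only for illustration.

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
here is some super text:
  this is text that should
  be selected with a cool match
And this is how it all ends
blah blah...
EOF

inner=$(sed -n '
# Start line found: load the next line and jump into the loop.
/some super text/ {
  n
  b c
}
# Any other line outside the loop is dropped.
d
# Loop body: stop at the end line, otherwise print and fetch the next line.
:c
/And this is how/ d
p
n
b c
' "$sample")
printf '%s\n' "$inner"

rm -f "$sample"
```

Only the two indented lines should come out, because the start line is skipped by n before anything is printed and the end line is deleted before p is reached.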

Or, similarly, in Perl:

perl -ne 'print if /some super text/ .. /And this is how/' file

To exclude the border lines again, change it to

perl -ne '$in = /some super text/ .. /And this is how/; print if $in > 1 and $in !~ /E/' file

Upvotes: 2

Avinash Raj

Reputation: 174696

Try pcregrep instead of plain grep. Plain grep matches one line at a time, so it can't capture a block that spans several lines.

$ pcregrep -M -o '(?s)some super text:[^\n]*\n\K.*?(?=\n[^\n]*And this is how)' file
  this is text that should
  be selected with a cool match
  • (?s) The dotall modifier lets the dot match newline characters as well.
  • \K discards everything matched so far, so only the text after it is printed.

From pcregrep --help

-M, --multiline              run in multiline mode
-o, --only-matching=n        show only the part of the line that matched

Upvotes: 1

Todd A. Jacobs

Reputation: 84343

TL;DR

With your corpus, another way to solve the problem is by matching lines with leading whitespace, rather than using a flip-flop operator of some sort to match start and end lines. The following solutions work with your posted example.

GNU Grep with PCRE Compiled In

$ grep -Po '^\s+\K.*' /tmp/corpus 
this is text that should
be selected with a cool match

Alternative: Use pcregrep Instead

$ pcregrep -o '^\s+\K.*' /tmp/corpus 
this is text that should
be selected with a cool match

Upvotes: 0

ooga

Reputation: 15501

I don't see how it could be done in grep. Using awk:

awk '/^And this is how/ {p=0}; p; /some super text:$/ {p=1}' file
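The order of the three rules is what excludes the border lines. Here is a sketch of the same program spread over several lines with comments, run on the sample from the question; the temporary file and the variable names (sample, between) are only for illustration.

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
here is some super text:
  this is text that should
  be selected with a cool match
And this is how it all ends
blah blah...
EOF

between=$(awk '
  # End marker: clear the flag before the print rule sees this line.
  /^And this is how/  { p = 0 }
  # A bare pattern prints the line whenever the flag is set.
  p
  # Start marker: set the flag after the print rule, so printing begins on the next line.
  /some super text:$/ { p = 1 }
' "$sample")
printf '%s\n' "$between"

rm -f "$sample"
```

Swapping the first and last rules would make the range inclusive instead, since the flag would then be set before the print rule and cleared after it.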

Upvotes: 1
