Reputation: 135
I have a file like this (test.txt):
abc
12
34
def
56
abc
ghi
78
def
90
And I would like to find the 78 that is enclosed by "abc\nghi" and "def". Currently, I know I can do this with:
cat test.txt | awk '/abc/,/def/' | awk '/ghi/,/def/'
Is there any better way?
Upvotes: 1
Views: 99
Reputation: 46826
You could do this with sed. It's not ideal in that it doesn't actually understand records, but it might work for you...
sed -Ene 'H;${x;s/.*\nabc\nghi\n([0-9]+)\ndef\n.*/\1/;p;}' input.txt
Here's what's basically going on:
H - appends the current line to sed's "hold space"
${ - specifies the start of a series of commands that will be run once we come to the end of the file
x - swaps the hold space with the pattern space, so that future substitutions will work on what was stored using H
s/../../ - analyses the pattern space (which is now multi-line), capturing the data specified in your question, replacing the entire pattern space with the bracketed expression...
p - prints the result.
One important factor here is that the regular expression is ERE, so the -E option is important. If your version of sed uses some other option to enable support for ERE, then use that option instead.
Another consideration is that the regex above assumes Unix-style line endings. If you try to process a text file that was generated on DOS or Windows, the regex may need to be a little different.
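If the input does come from DOS or Windows, one option is to strip the carriage returns before sed sees the data. This is only a minimal sketch: it assumes tr is available, and the temporary file is a made-up CRLF copy of the data from the question, standing in for your real input.

```shell
# Hypothetical sample: the data from the question, saved with CRLF line endings.
sample=$(mktemp)
printf 'abc\r\n12\r\n34\r\ndef\r\n56\r\nabc\r\nghi\r\n78\r\ndef\r\n90\r\n' > "$sample"

# Remove the carriage returns, then run the same sed program as above.
result=$(tr -d '\r' < "$sample" |
  sed -En 'H;${x;s/.*\nabc\nghi\n([0-9]+)\ndef\n.*/\1/;p;}')

printf '%s\n' "$result"
rm -f "$sample"
```

Be aware that if the substitution does not match, the p command still prints the whole accumulated hold space, so you may want to check the output before relying on it.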
Upvotes: 0
Reputation: 67467
grep alternative
$ grep -Pazo '(?s)(?<=abc\nghi)(.*)(?=def)' file
but I think awk will be better
Upvotes: 0
Reputation: 26471
This is not really clean, but you can redefine your record separator as a regular expression: abc\nghi\n|\ndef. This, however, creates multiple records, and you need to keep track of which ones lie between the correct separators. With GNU awk you can check which RS was matched using RT.
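awk 'BEGIN{RS="abc\nghi\n|\ndef"}
(RT~/abc/){s=1; next}          # record ended at abc\nghi: the next record is a candidate
(s==1)&&(RT~/def/){print $0}   # candidate record ended at \ndef: print it
{s=0}' file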
This does the following:
- sets RS to abc\nghi\n or \ndef.
- if RT contains abc, you found the first one.
- if you found the first one and the next RT contains def, then print.
Upvotes: 0
Reputation: 23667
One way is to use flags
$ awk '/ghi/ && p~/abc/{f=1} f; /def/{f=0} {p=$0}' test.txt
ghi
78
def
{p=$0} - this will save the input line for future use
/ghi/ && p~/abc/{f=1} - set the flag if the current line contains ghi and the previous line contains abc
f; - print the input record as long as the flag is set
/def/{f=0} - clear the flag if the line contains def
If you only want the lines between these two boundaries:
$ awk '/ghi/ && p~/abc/{f=1; next} /def/{f=0} f; {p=$0}' test.txt
78
$ awk '/12/ && p~/abc/{f=1; next} /def/{f=0} f; {p=$0}' test.txt
34
See also How to select lines between two patterns?
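One caveat: the regex tests above are substring matches, so /ghi/ would also fire on a line such as "xghiy". If the boundary lines have to match exactly, a variant using string comparison might look like the sketch below. The temporary file is only a stand-in for test.txt, and the rule order is my own arrangement rather than anything required.

```shell
# Stand-in for test.txt from the question.
sample=$(mktemp)
printf '%s\n' abc 12 34 def 56 abc ghi 78 def 90 > "$sample"

result=$(awk '
  $0 == "def" { f = 0 }                 # closing boundary: stop printing
  f                                     # print the line while the flag is set
  $0 == "ghi" && p == "abc" { f = 1 }   # exact opening pair seen: print from the next line on
  { p = $0 }                            # remember this line for the next record
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Setting the flag after the print rule means the ghi line itself is never printed, so no next is needed.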
Upvotes: 2
Reputation: 92854
awk solution:
awk '/ghi/ && r=="abc"{ f=1; n=NR+1 }f && NR==n{ v=$0 }v && NR==n+1{ print v }{ r=$0 }' file
The output:
78
Bonus GNU awk approach:
awk -v RS= 'match($0,/\nabc\nghi\n(.+)\ndef/,a){ print a[1] }' file
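Note that the first command prints the line after ghi without checking that def actually follows it. If that matters, a sliding-window sketch in plain awk (no GNU extensions) could verify all four lines. The variable names a, b and c are just illustrative, and it assumes exactly one line sits between "abc\nghi" and "def":

```shell
# Stand-in for the input file from the question.
sample=$(mktemp)
printf '%s\n' abc 12 34 def 56 abc ghi 78 def 90 > "$sample"

result=$(awk '
  # a, b, c hold the lines three, two and one record back, respectively.
  a == "abc" && b == "ghi" && $0 == "def" { print c }
  { a = b; b = c; c = $0 }              # slide the window forward by one line
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```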
Upvotes: -1