Reputation: 2189
I am trying to use grep to perform a multiline search on Linux, but I am having problems with it.
Basically, I want to extract all the lines that are followed by a line containing the string Sequences in the example below.
Query= BRNW_157
Sequences producing significant alignments: (Bits) Value
Query= BRNW_428
Query= BRNW_503
Sequences producing significant alignments: (Bits) Value
Query= BRNW_601
Query= BRNW_617
Sequences producing significant alignments: (Bits) Value
I tried awk, but it doesn't work:
awk '/Query=*/,/Sequences*/'
Then I tried grep, and that doesn't work either:
grep -PZo 'Query=*\n.*sequences'
Is there a way around this problem?
Upvotes: 0
Views: 223
Reputation: 204638
Are you saying you want to find the word Sequences and print that line plus the line before it?
That'd just be:
awk '/Sequences/{print prev ORS $0} {prev=$0}' file
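To check this against the sample in the question, here is a minimal sketch. The temporary file and the `sample` and `result` variable names are only illustrative assumptions, not part of the original answer.

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Query= BRNW_157
Sequences producing significant alignments: (Bits) Value
Query= BRNW_428
Query= BRNW_503
Sequences producing significant alignments: (Bits) Value
Query= BRNW_601
Query= BRNW_617
Sequences producing significant alignments: (Bits) Value
EOF

# The second rule stores every line in prev after it has been processed.
# When the current line contains "Sequences", the first rule prints the
# stored previous line, a newline (ORS), and then the current line.
result=$(awk '/Sequences/{print prev ORS $0} {prev=$0}' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

With this input, only the Query lines directly above a Sequences line (BRNW_157, BRNW_503 and BRNW_617) should be printed, each followed by its Sequences line. If the very first line of a file matched Sequences, prev would still be empty and an empty line would be printed before it.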
Upvotes: 1
Reputation: 23394
You are probably looking for
grep -oPz '(?ms)Query=(?:(?!Query).)*?Sequences.*?$'
This passes the PCRE MULTILINE and DOTALL flags via (?ms) and picks out each segment from a Query line to the next Sequences line.
Additionally, the -z flag makes grep treat NUL as the line separator, making the contents of the file appear to it as a single string.
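A minimal sketch of running this on the sample from the question, assuming a GNU grep built with PCRE support (the -P option is not available in every grep). The temporary file, the `sample` and `result` variable names, and the `tr` step are illustrative additions, not part of the original answer.

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Query= BRNW_157
Sequences producing significant alignments: (Bits) Value
Query= BRNW_428
Query= BRNW_503
Sequences producing significant alignments: (Bits) Value
Query= BRNW_601
Query= BRNW_617
Sequences producing significant alignments: (Bits) Value
EOF

# -o prints only the matched text, -P enables PCRE syntax, and -z makes
# grep read NUL-separated records, so the whole file is one record here.
# (?!Query) stops a match from running across a second Query line.
# With -z each match is written with a trailing NUL, so tr converts the
# NUL bytes to newlines before the text is captured.
result=$(grep -oPz '(?ms)Query=(?:(?!Query).)*?Sequences.*?$' "$sample" | tr '\0' '\n')
printf '%s\n' "$result"

rm -f "$sample"
```

On this input the matches should be the BRNW_157, BRNW_503 and BRNW_617 blocks. Note that the pattern matches Sequences with a capital S, whereas the grep attempt in the question searched for lowercase sequences.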
Upvotes: 1