slybloty

Reputation: 6516

Issues with regex when searching pattern on two lines

I know this type of search has been addressed in a few other questions here, but for some reason I cannot get it to work in my scenario.
I have a text file that contains something similar to the following pattern:

some text here done
12345678_123456 226-
more text
some more text here done
12345678_234567 226-

I'm trying to find all cases where done is followed by 226- on the next line, with exactly 16 characters preceding the 226-. I tried grep -Pzo and pcregrep -M, but both return nothing.

I tried multiple regex combinations to take into account the 2 lines and the 16 chars in between. This is one of the examples I tried with grep:

grep -Pzo '(?s)done\n.\{16\}226-' filename

Upvotes: 1

Views: 57

Answers (2)

user557597

Generalize it to this: (?m)done$\s+.*226-$

Requiring a \n after 226- is a problem when the match sits at the end of the string, where there may be no trailing newline.
Not requiring anything after 226- is also a problem, because the line could then continue past it.
The alternation (\n|$) resolves that paradox, but then why match the \n at all?

Multiline mode with $ solves both problems.

https://regex101.com/r/A33cj5/1
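
The end-of-string point can be checked from the shell. The following is only a sketch: it assumes a reasonably recent GNU grep built with PCRE support (some older releases reject $ in combination with -Pz), and the temporary file and variable names are made up for illustration. The sample is written without a trailing newline, so the last 226- sits at the very end of the input.

```shell
# Hypothetical sample: the data from the question, deliberately written
# WITHOUT a newline after the final "226-".
sample_nonl=$(mktemp)
printf 'some text here done\n12345678_123456 226-\nmore text\nsome more text here done\n12345678_234567 226-' > "$sample_nonl"

# Requiring a literal \n after 226- misses the pair at the end of the input.
# grep -z separates matches with NUL bytes, so tr turns them into newlines.
with_newline=$(grep -Pzo 'done\n.{16}226-\n' "$sample_nonl" | tr '\0' '\n' || true)

# Multiline mode: $ matches before a newline and also at the end of the input.
with_anchor=$(grep -Pzo '(?m)done$\s+.*226-$' "$sample_nonl" | tr '\0' '\n' || true)

printf 'with \\n:\n%s\n\n' "$with_newline"
printf 'with (?m) and $:\n%s\n' "$with_anchor"

rm -f "$sample_nonl"
```

Note that this pattern no longer enforces the 16-character width; replacing .* with .{16} would restore that constraint.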

Upvotes: 1

anubhava

Reputation: 785406

You must not escape { and } when using the -P (PCRE) option in grep: under -P, \{ and \} match literal brace characters instead of forming a quantifier. That escaping is only needed for BRE.

You can use:

grep -ozP 'done\R.{16}226-\R' file

done
12345678_123456 226-
done
12345678_234567 226-

\R matches any Unicode newline sequence, including \r\n. If you are only dealing with \n, you can simply use:

grep -ozP 'done\n.{16}226-\n' file
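
To see the effect of the escaping side by side, here is a small sketch that rebuilds the sample from the question in a temporary file and runs both forms of the pattern. It assumes GNU grep built with PCRE support; the file and variable names are made up for illustration.

```shell
# Hypothetical sample file reproducing the data from the question.
sample=$(mktemp)
printf '%s\n' \
  'some text here done' \
  '12345678_123456 226-' \
  'more text' \
  'some more text here done' \
  '12345678_234567 226-' > "$sample"

# Escaped braces: under -P, \{16\} is the literal text "{16}", so nothing matches.
# "|| true" keeps the script going when grep exits with status 1 (no match).
escaped=$(grep -Pzo 'done\n.\{16\}226-' "$sample" | tr '\0' '\n' || true)

# Unescaped braces: {16} is a quantifier, so both two-line pairs match.
# grep -z separates matches with NUL bytes; tr converts them to newlines.
unescaped=$(grep -Pzo 'done\n.{16}226-' "$sample" | tr '\0' '\n' || true)

printf 'escaped braces:   [%s]\n' "$escaped"
printf 'unescaped braces:\n%s\n' "$unescaped"

rm -f "$sample"
```

The trailing \n (or \R) is left out of the pattern here so that the converted output contains no blank lines.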

Upvotes: 0
