Sebastian Zeki

Reputation: 6874

regex to match the first lookbehind only

I have the following reports. Some have this:

Symptom Correlation to Reflux
Table 
Symptom Correlation to Reflux
Table 
Reflux Symptom Index 
Table 

and some have this:

Symptom Correlation to Reflux
Table 
Reflux Symptom Index 
Table 

I want to only ever capture the Table between Symptom Correlation to Reflux and Reflux Symptom Index.

How can I use a positive lookbehind so that, when Symptom Correlation to Reflux appears more than once, I capture only the Table that sits directly before Reflux Symptom Index? I guess this needs a non-greedy operator combined with the positive lookbehind.

Is it something like this (which doesn't work)?

.*?(?<=Reflux Symptom Index)Symptom Correlation to Reflux

Upvotes: 1

Views: 2270

Answers (3)

Quinn

Reputation: 4504

Please try this pattern:

/(?<=\bSymptom Correlation to Reflux\b\n)(\S+)\s*(?=\bReflux Symptom Index\b)/g


Upvotes: 0

collapsar

Reputation: 17238

You may apply this regex:

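/(?<=\bSymptom Correlation to Reflux\b).*?(?=\bReflux Symptom Index\b)/s

The match starts right after the first occurrence of Symptom Correlation to Reflux and, because .*? is lazy, stops at the first occurrence of Reflux Symptom Index that follows (a greedy .* would run on to the last one). Note the s flag, which makes . match newlines (not the default). If the heading is repeated, the match still starts after the first heading, so it will include the repeated heading and both tables.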
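To see how the quantifier between the two lookarounds decides where the match ends, here is a minimal Java sketch. The class name GreedyVsLazy, the helper firstMatch, and the sample text (two reports concatenated, with made-up table labels A to D) are assumptions for illustration, not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class GreedyVsLazy {
    static final String START = "(?<=\\bSymptom Correlation to Reflux\\b)";
    static final String END = "(?=\\bReflux Symptom Index\\b)";

    // Builds START + quantified + END with DOTALL (the "s" flag) and
    // returns the first match, trimmed, or null if nothing matches.
    static String firstMatch(String quantified, String text) {
        Matcher m = Pattern.compile(START + quantified + END, Pattern.DOTALL)
                           .matcher(text);
        return m.find() ? m.group().trim() : null;
    }

    public static void main(String[] args) {
        // Assumed sample input: two reports in one string.
        String text = "Symptom Correlation to Reflux\nTable A\n"
                    + "Reflux Symptom Index\nTable B\n"
                    + "Symptom Correlation to Reflux\nTable C\n"
                    + "Reflux Symptom Index\nTable D\n";
        System.out.println(firstMatch(".*?", text)); // lazy: stops at the first end marker
        System.out.println("---");
        System.out.println(firstMatch(".*", text));  // greedy: runs to the last end marker
    }
}
```

With a single Reflux Symptom Index in the input, both quantifiers give the same result; the difference only shows when the end marker occurs more than once.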

Upvotes: 1

anubhava

Reputation: 784888

In Java you can use this regex with a negative lookahead:

(?s)\bSymptom Correlation to Reflux\b((?:(?!Symptom Correlation to Reflux).)*?)\bReflux Symptom Index\b

Java code:

Pattern p = Pattern.compile(
"(?s)\\bSymptom Correlation to Reflux\\b((?:(?!Symptom Correlation to Reflux).)*?)\\bReflux Symptom Index\\b");

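The table is available in capture group #1.

(?:(?!Symptom Correlation to Reflux).)*? consumes one character at a time, and the negative lookahead (?!Symptom Correlation to Reflux) checks before each character that another Symptom Correlation to Reflux does not begin there. This ensures the match cannot span a second heading between the start and end markers, so it begins at the last heading before Reflux Symptom Index.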
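As a rough sketch of how the pattern above could be applied to both report layouts, the following wraps it in a small helper. The class name RefluxTableExtractor, the method extractTable, and the table labels A to C in the sample input are assumptions for illustration; the regex itself is the one from this answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern from the answer.
class RefluxTableExtractor {
    // (?s) lets '.' match newlines; group 1 holds the text between the markers.
    private static final Pattern TABLE = Pattern.compile(
        "(?s)\\bSymptom Correlation to Reflux\\b"
        + "((?:(?!Symptom Correlation to Reflux).)*?)"
        + "\\bReflux Symptom Index\\b");

    // Returns the trimmed contents of group 1, or null if the markers are absent.
    static String extractTable(String report) {
        Matcher m = TABLE.matcher(report);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Assumed sample: the heading appears twice before the end marker.
        String doubled = "Symptom Correlation to Reflux\nTable A \n"
                       + "Symptom Correlation to Reflux\nTable B \n"
                       + "Reflux Symptom Index \nTable C \n";
        System.out.println(extractTable(doubled)); // prints "Table B"
    }
}
```

The trim() call is a convenience added here, since group 1 otherwise includes the newlines around the table.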


Upvotes: 1
