blub.flows

Reputation: 33

Capture Text Surrounding Regex Match .NET

I am building an application, and I have a requirement to capture characters before and after matches. This seems to work okay, except when there are multiple matches within the surrounding capture.

Regex:

.{0,10}(?=abc)

This should capture up to 10 characters before the string "abc" is found.

The issue comes up if there is a recurrence of the match in the preceding text:

"qqqqabcabcqqq"

With the above text, I would expect two captures:

qqqq (the 4 characters before the first abc occurrence)
qqqqabc (the 7 characters before the second abc occurrence)

I am not, however, getting these matches. The only match I get is:

qqqqabc

I am certain that I am missing something, but I am not sure what. I believe that my regex is somehow being too greedy, and so it is overlooking the first match in favor of the larger, second one. Here is what I need:

I need a regex that:

1. Is for .NET

2. Looks within a string for X characters before an exact match on string S.

3. Includes any secondary match on S (call it S') that is found within the X characters before S.

4. Does not care in the slightest what these characters are.

I assure you, I tried looking for similar answers, but I wasn't able to find anything that directly answers this question, which has been plaguing me for two days. (Yes, I have to use regular expressions.) As for regex flavor, I am working in .NET.

Thank you so much for any help.

Upvotes: 3

Views: 738

Answers (1)

juanlu

Reputation: 408

Here it is:

(?<=(?<CharsBefore>.{0,10}))(?=abc)

Took me a while to remember that .NET allows variable-length positive lookbehinds.

Regex test

Demo in C#

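I changed the way your initial version worked a bit. Your pattern consumes the characters it matches, and the greedy .{0,10} takes the longest run that is followed by abc, so the shorter "qqqq" starting at the same position is never reported. This version instead matches the zero-width position just before each abc and captures the preceding characters inside the lookbehind, so the captured text can overlap between matches.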
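For reference, here is a minimal sketch of how the named-group pattern might be used from C#. The class name CharsBeforeDemo and the helper Capture are hypothetical names introduced only for this illustration, and wrapping the search string in Regex.Escape is an added assumption so that the target is always treated as literal text.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CharsBeforeDemo
{
    // Hypothetical helper (not part of the original answer): for every
    // occurrence of `target`, returns up to `maxChars` characters before it.
    public static List<string> Capture(string input, string target, int maxChars)
    {
        // Zero-width match just before each occurrence of `target`;
        // the lookbehind captures the preceding characters without consuming them.
        string pattern = "(?<=(?<CharsBefore>.{0," + maxChars + "}))(?="
                         + Regex.Escape(target) + ")";

        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            results.Add(m.Groups["CharsBefore"].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Sample input taken from the question.
        foreach (string s in Capture("qqqqabcabcqqq", "abc", 10))
        {
            Console.WriteLine(s);
        }
    }
}
```

With the sample string from the question, this should print qqqq and then qqqqabc, one per line. Because . does not match a newline by default, the captured text stops at line breaks unless RegexOptions.Singleline is passed.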

Hope it helps!

PS: I've named the group, but you are obviously free to keep it nameless and work with numbered groups if you want a less cluttered regex, like so:

(?<=(.{0,10}))(?=abc)

Upvotes: 2
