Reputation: 33
I am building an application, and I have a requirement to capture characters before and after matches. This seems to work okay, except when there are multiple matches within the surrounding capture.
Regex:
.{0,10}(?=abc)
This should capture up to 10 characters before the string "abc" is found.
The issue comes up if there is a recurrence of the match in the preceding text:
"qqqqabcabcqqq"
With the above text, I would expect two captures:
qqqq (the 4 characters before the first abc occurrence)
qqqqabc (the 7 characters before the second abc occurrence)
I am not, however, getting these matches. The only match I get is:
qqqqabc
I am certain that I am missing something, but I am not sure what. I believe that my regex is somehow being too greedy, so it overlooks the first match in favor of the larger, second one.
I need a regex that:
1. Is for .NET
2. Looks within a string for X characters before an exact match on string S.
3. Includes any secondary match on S (call it S') that is found within X characters before S.
4. Does not care in the slightest what these characters are.
I assure you, I tried looking for similar answers, but I wasn't able to find anything that directly answers this question, which has been plaguing me for two days. And yes, I have to use regular expressions. As for regex flavor, I am working in .NET.
Thank you so much for any help.
Upvotes: 3
Views: 738
Reputation: 408
Here it is:
(?<=(?<CharsBefore>.{0,10}))(?=abc)
Took me a while to remember that .NET allows variable-length lookbehinds.
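I changed the way your initial version worked a bit. Your pattern consumes the characters it matches, so once "qqqqabc" has been matched, the first "abc" is already used up and cannot start a separate match. In my version the match itself is zero-width: the lookahead finds each position where "abc" starts, and the lookbehind captures up to 10 characters before that position into the group. Nothing is consumed, so overlapping contexts are each reported. Note that the text you want is in the group, not in the match value (which is empty).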
Hope it helps!
PS: I've named the group, but you are obviously free to keep it nameless and work with numbered groups if you want a less cluttered regex, like so:
(?<=(.{0,10}))(?=abc)
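If you want to sanity-check the pattern outside .NET, here is a minimal sketch in TypeScript. JavaScript engines also support variable-length lookbehind, so the numbered-group pattern above runs unchanged. The function name contextBefore is just something I made up for illustration, and this is not .NET code.

```typescript
// Hypothetical helper (name is an assumption): returns, for every
// occurrence of "abc", the up-to-10 characters that precede it.
function contextBefore(text: string): string[] {
  // Same pattern as above; built from a string so it is only parsed at runtime.
  const pattern = new RegExp("(?<=(.{0,10}))(?=abc)", "g");
  const results: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    // The match itself is empty; the preceding text is in group 1.
    results.push(m[1] ?? "");
    // A zero-width match does not advance lastIndex, so step forward manually.
    if (m[0].length === 0) {
      pattern.lastIndex++;
    }
  }
  return results;
}

console.log(contextBefore("qqqqabcabcqqq")); // [ 'qqqq', 'qqqqabc' ]
```

In .NET you would do the same thing by looping over Regex.Matches(input, pattern) and reading match.Groups["CharsBefore"].Value (or match.Groups[1].Value for the nameless version). One caveat for your requirement 4: by default the dot does not match a newline, so if the surrounding text can span lines you will probably want RegexOptions.Singleline.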
Upvotes: 2