Reputation: 37227
For example, I define "keyword foo near start of string" as this regex:
(?<=^.{,10})foo
And I define "short string" as this regex (or equivalently, 30 chars or less):
^(?=.{,30}$)
Now the question: I want to match "keyword foo near the start of a short string" with a single regex, but I'm unsure how to do it. The matched text must be exactly "foo", so the surrounding text has to be handled with lookarounds.
This one is what I've tried and it obviously doesn't work:
^(?=.{0,30}$)(?<=^.{,10})foo
This one works but matches too much text. I only want foo, not aafoo:
^(?=.{0,30}$).{,10}foo
Expected inputs and outputs:
aaaaaaaaaaa => None
aafooaaaaaa => "foo" (at position 2-5)
aaaaaaaaaaafoo => None (Too far from start of string)
aafooaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa => None (String too long)
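To pin down the intended behaviour independently of any regex, here is a minimal plain-Python sketch of the same rule. The function name find_foo and its parameters are hypothetical, chosen only for illustration, and the sample strings are modeled on the examples above rather than copied character for character. It assumes single-line input, since . in the patterns above does not match a newline.

```python
def find_foo(text, keyword="foo", max_offset=10, max_length=30):
    """Return the (start, end) span of keyword, or None.

    Hypothetical reference helper: the keyword must start at most
    max_offset characters from the beginning, and the whole string
    must be at most max_length characters long.
    """
    if len(text) > max_length:
        return None  # string too long
    start = text.find(keyword)  # first occurrence, or -1
    if start == -1 or start > max_offset:
        return None  # missing, or too far from the start
    return (start, start + len(keyword))


samples = [
    "a" * 11,              # no keyword at all
    "aafoo" + "a" * 6,     # keyword near the start of a short string
    "a" * 11 + "foo",      # keyword too far from the start
    "aafoo" + "a" * 36,    # string longer than 30 characters
]
for sample in samples:
    print(repr(sample), "=>", find_foo(sample))
```

Any single-regex solution should agree with this helper on these inputs.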
I'm using the third-party PyPI package regex (not the built-in re) on Python 3.
Upvotes: 4
Views: 423
Reputation: 370679
Inside the lookbehind, right after matching the start of the string with ^, use a lookahead to ensure that the end of the string is at most 30 characters away. The lookahead consumes no characters, so the lookbehind can then consume up to 10 characters to reach foo. You can use the pattern
(?<=^(?=.{0,30}$).{,10})foo
See:
import regex  # third-party package: pip install regex

pattern = r'(?<=^(?=.{0,30}$).{,10})foo'
# matches
print(regex.search(pattern, 'text foo text'))
# fails, foo is more than 10 characters away from the start of the string:
print(regex.search(pattern, 'text text text foo text'))
# fails, string is more than 30 characters long:
print(regex.search(pattern, 'text foo text long long string long long string long long string long long string'))
Output:
<regex.Match object; span=(5, 8), match='foo'>
None
None
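If the third-party regex package is not available, a rough fallback with the built-in re module is sketched below. This is not the lookbehind approach above: re rejects variable-width lookbehinds, so this variant consumes the leading characters and exposes foo through a capture group instead, which means the full match is not just foo. The names PATTERN and foo_span are hypothetical.

```python
import re

# ^(?=.{0,30}$)  the whole string is at most 30 characters long
# .{0,10}?       lazily skip up to 10 leading characters
# (foo)          group 1 holds the keyword itself
PATTERN = re.compile(r'^(?=.{0,30}$).{0,10}?(foo)')


def foo_span(text):
    """Return the span of group 1 (the keyword), or None if no match."""
    m = PATTERN.search(text)
    return m.span(1) if m else None


print(foo_span('text foo text'))
print(foo_span('text text text foo text'))
print(foo_span('aafoo' + 'a' * 36))
```

Reading m.span(1) or m.group(1) rather than the whole match recovers just the keyword, at the cost of not satisfying the requirement that the match itself be foo.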
Upvotes: 3