Reputation: 1345
I'm trying to use the re module so that it returns a run of characters up to the point where an individual character is followed by a particular string. The re documentation seems to indicate that I can use a negative lookahead, (?!...), to accomplish this. Here is the example I'm currently wrestling with:
import re
str_to_search = 'abababsonab, etc'
first = re.search(r'(ab)+(?!son)', str_to_search)
second = re.search(r'.+(?!son)', str_to_search)
first.group() is 'abab', which is what I'm aiming for. However, second.group() returns the entire str_to_search string, despite the fact that I'm trying to make it stop at 'ababa', as the subsequent 'b' is immediately followed by 'son'. Where am I going wrong?
Upvotes: 0
Views: 76
Reputation:
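This should work, because the lookahead is checked after every character rather than once at the end of the match:
second = re.search(r'(.(?!son))+', str_to_search)
# second.group() -> 'ababa'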
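To see why the original pattern swallowed the whole string, here is a minimal sketch comparing the two patterns side by side. The variable names whole and per_char are only for illustration.

```python
import re

str_to_search = 'abababsonab, etc'

# Greedy '.+' runs to the end of the string. The negative lookahead is then
# tested once, at the end, where nothing follows at all, so it succeeds.
whole = re.search(r'.+(?!son)', str_to_search)

# With the lookahead inside the repeated group, it is tested after every
# single character, so the repetition stops before the 'b' that precedes 'son'.
per_char = re.search(r'(.(?!son))+', str_to_search)

print(whole.group())     # abababsonab, etc
print(per_char.group())  # ababa
```

In other words, (?!son) only constrains the position where it sits in the pattern; it does not limit what the preceding .+ is allowed to consume.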
Upvotes: 1
Reputation: 4500
It's not the simplest thing, but you can match a repeated sequence of "a character not followed by 'son'". Put the repeated expression in a non-capturing group, (?: ... ), so it doesn't affect your match results; with a plain capturing group you'd end up with an extra match group.
Try this:
import re
str_to_search = 'abababsonab, etc'
second = re.search(r'(?:.(?!son))+', str_to_search)
print(second.group())
Output:
ababa
See it here: http://ideone.com/6DhLgN
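As a rough illustration of the "extra match group" point, this sketch compares a capturing and a non-capturing version of the same pattern. The names capturing and non_capturing are made up for the example.

```python
import re

str_to_search = 'abababsonab, etc'

# Capturing group: the overall match is the same, but group 1 also exists
# and holds only the last repetition of the group.
capturing = re.search(r'(.(?!son))+', str_to_search)

# Non-capturing group: same overall match, no numbered groups at all.
non_capturing = re.search(r'(?:.(?!son))+', str_to_search)

print(capturing.group())       # ababa
print(capturing.groups())      # ('a',)
print(non_capturing.group())   # ababa
print(non_capturing.groups())  # ()
```

Both give the same group(), so the difference only matters if you later inspect groups() or refer to groups by number.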
Upvotes: 2
Reputation: 1056
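I'm not sure exactly what you are trying to do, but a few pointers:
Check out str.partition, which splits a string around a separator.
'.+?' is the minimal (non-greedy) matcher; plain '.+' is greedy and consumes everything it can.
Read the docs for group(...) and groups(...), especially what they return when you pass a group number.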
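A small sketch of these suggestions, assuming the goal is to get the text before 'son'. Note that these return 'ababab' (everything before 'son'), not the 'ababa' asked for in the question, and that swapping in '.+?' alone with the negative lookahead stops after a single character. The variable names are illustrative only.

```python
import re

str_to_search = 'abababsonab, etc'

# str.partition splits around the first occurrence of the separator.
before, sep, after = str_to_search.partition('son')
print(before)  # ababab

# Non-greedy '.+?' with the original negative lookahead stops as early as
# possible: one character, since 'a' is not followed by 'son'.
lazy_negative = re.search(r'.+?(?!son)', str_to_search)
print(lazy_negative.group())  # a

# Non-greedy '.+?' with a positive lookahead grows until 'son' comes next.
lazy_positive = re.search(r'.+?(?=son)', str_to_search)
print(lazy_positive.group())  # ababab
```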
Upvotes: 0