Reputation: 10249
I have a string:
test2 = "-beginning realization 5 -singlespace -multispaceafter not-this-one -whitespace\t\n -end"
I want to find all of the substrings that begin with a minus sign (-).
I can find all but the last occurrence:
re.findall(ur"\B-(.*?)\s", test2)
returns [u'beginning', u'singlespace', u'multispaceafter', u'whitespace']
I can find "the last occurrence":
re.findall(ur"\B-(.*?)\Z", test2)
returns [u'end']
However, I want a regex that returns
[u'beginning', u'singlespace', u'multispaceafter', u'whitespace', u'end']
Upvotes: 0
Views: 136
Reputation:
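The last one (-end) doesn't match because your regex requires a whitespace character after each match, and nothing follows -end in the string.
Try this instead (shown first on one line, then spread out in free-spacing form):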
# (?:^|\s)-(.*?)(?=\s|$)
(?: ^ | \s )
-
( .*? )
(?= \s | $ )
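To check the pattern against the string from the question, here is a minimal sketch. It assumes the spread-out form is compiled with re.VERBOSE (so whitespace and comments inside the pattern are ignored); the names pattern and matches are mine, not from the answer.

```python
import re

test2 = "-beginning realization 5 -singlespace -multispaceafter not-this-one -whitespace\t\n -end"

# Same pattern as above, compiled with re.VERBOSE so the layout can be kept.
pattern = re.compile(r"""
    (?: ^ | \s )     # start of string or a whitespace character
    -                # the literal minus sign
    ( .*? )          # capture as little as possible
    (?= \s | $ )     # stop before whitespace or at the end of the string
""", re.VERBOSE)

matches = pattern.findall(test2)
print(matches)
```

Because the trailing condition is a lookahead, the whitespace after one match is not consumed and is still available as the leading whitespace of the next match.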
Upvotes: 1
Reputation: 67968
(?<=\s)-(.*?)(?=\s|$)|(?<=^)-(.*?)(?=\s|$)
Try this. See the demo:
http://regex101.com/r/cN7qZ7/6
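One thing to watch out for if you use this with re.findall in Python: the pattern has two capturing groups, so findall returns a list of 2-tuples rather than plain strings, with an empty string for whichever group did not take part in the match. A rough sketch of one way to flatten the result (the names pairs and flat are just for illustration):

```python
import re

test2 = "-beginning realization 5 -singlespace -multispaceafter not-this-one -whitespace\t\n -end"

# Two alternatives, each with its own capturing group.
pattern = r'(?<=\s)-(.*?)(?=\s|$)|(?<=^)-(.*?)(?=\s|$)'

# findall yields one tuple per match: (group 1, group 2).
pairs = re.findall(pattern, test2)

# Keep whichever group actually matched in each tuple.
flat = [first or second for first, second in pairs]
print(flat)
```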
Upvotes: 1
Reputation: 70732
You can use a non-capturing group to require that either whitespace or the end of the string follows.
>>> re.findall(r'\B-(.*?)(?:\s|$)', test2)
Although, instead of \B and the non-capturing group, I recommend the following:
>>> re.findall(r'(?<!\S)-(\S+)', test2)
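Both patterns give the same result on the string from the question. One way they can differ, as far as I can tell, is when a minus sign follows punctuation: \B only checks that there is no word boundary, so it also matches between two non-word characters such as "." and "-", while (?<!\S) insists on whitespace or the start of the string. A small sketch, where the string tricky is a made-up input and not from the question:

```python
import re

test2 = "-beginning realization 5 -singlespace -multispaceafter not-this-one -whitespace\t\n -end"

# On the original string the two patterns agree.
with_b = re.findall(r'\B-(.*?)(?:\s|$)', test2)
with_lookbehind = re.findall(r'(?<!\S)-(\S+)', test2)

# Hypothetical input: a minus sign directly after punctuation.
tricky = "-a b.-c e-f"
tricky_b = re.findall(r'\B-(.*?)(?:\s|$)', tricky)
tricky_lookbehind = re.findall(r'(?<!\S)-(\S+)', tricky)

print(tricky_b)           # \B version also picks up the "-c" after the dot
print(tricky_lookbehind)  # lookbehind version only accepts "-a"
```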
Upvotes: 3
Reputation: 174716
You could also try the code below:
>>> import re
>>> test2 = "-beginning realization 5 -singlespace -multispaceafter not-this-one -whitespace\t\n -end"
>>> m = re.findall(r'(?:\s|^)-(\S+)', test2)
>>> m
['beginning', 'singlespace', 'multispaceafter', 'whitespace', 'end']
Upvotes: 2