Talha Ashraf

Reputation: 1255

Regex: capture everything except a word (break at a word, not a character)

Inputs

a = "Miami, FL"
b = "Boston, MA or Miami, FL"
c = "United Kingdom"

RegEx

import re

loc = re.compile(r'([^or]+)[,]*[\s]*([A-Z]+)')
locs = loc.findall(b)

How can I make it break at the word or? I know [^or] is a character class, so it breaks at any single o or r rather than at the word. [^(or)] and [^\(or\)] don't work either.

Upvotes: 1

Views: 94

Answers (2)

falsetru

Reputation: 369324

It seems like you want to split the string on the word or. Use re.split for that; the re.findall calls after it extract each location directly:

>>> b = "Boston, MA or Miami, FL"
>>> re.split(r'\bor\b', b)
['Boston, MA ', ' Miami, FL']
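
The split above leaves whitespace around each piece. A minimal sketch of one way to trim it in the same step, assuming the locations are always separated by the lowercase word or with whitespace on both sides (split_locations is a hypothetical helper name, not part of the original answer):

```python
import re

def split_locations(text):
    # Hypothetical helper: treat "or" plus its surrounding whitespace as the
    # separator, so the pieces come back already trimmed.
    # Requiring whitespace on both sides keeps "or" inside words
    # (for example "Portland") from being treated as a separator.
    return re.split(r'\s+or\s+', text.strip())

print(split_locations("Boston, MA or Miami, FL"))  # ['Boston, MA', 'Miami, FL']
print(split_locations("United Kingdom"))           # ['United Kingdom']
```

Because the match is case-sensitive, a state code such as OR should not be mistaken for the separator.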

>>> re.findall(r'(?:^|or)\s*([^,]+,?\s[a-z]+)', a, flags=re.I)
['Miami, FL']
>>> re.findall(r'(?:^|or)\s*([^,]+,?\s[a-z]+)', b, flags=re.I)
['Boston, MA', 'Miami, FL']
>>> re.findall(r'(?:^|or)\s*([^,]+,?\s[a-z]+)', c, flags=re.I)
['United Kingdom']
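
If you really do want "match everything up to a word" rather than a split, a negative lookahead is one common way to express it, since a character class can only exclude single characters. The following is a sketch under the assumption that the separator is the standalone lowercase word or; LOCATION and find_locations are hypothetical names introduced for illustration:

```python
import re

# Consume one character at a time, but only while the standalone word "or"
# does not start at the current position (negative lookahead).
# The optional prefix swallows the separator left over from the previous match.
LOCATION = re.compile(r'(?:\bor\b\s*)?((?:(?!\bor\b).)+)')

def find_locations(text):
    # Hypothetical helper: strip the captured pieces and drop empty ones.
    pieces = (piece.strip() for piece in LOCATION.findall(text))
    return [piece for piece in pieces if piece]

print(find_locations("Miami, FL"))                # ['Miami, FL']
print(find_locations("Boston, MA or Miami, FL"))  # ['Boston, MA', 'Miami, FL']
print(find_locations("United Kingdom"))           # ['United Kingdom']
```

The word boundaries (\b) are what make this break at the word and not at the letters o and r, so names such as Portland pass through intact.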

Upvotes: 3

Sabuj Hassan

Reputation: 39395

This should work for you:

loc = re.compile(r'(?:^|or)\s*([^,]+),\s([A-Z]+)')
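
For reference, here is a quick check of that pattern against the three inputs from the question. The pattern has two groups, so findall returns (city, state) tuples, and it requires a comma, so an input without one yields no match:

```python
import re

# Same pattern as above, written as a raw string.
loc = re.compile(r'(?:^|or)\s*([^,]+),\s([A-Z]+)')

a = "Miami, FL"
b = "Boston, MA or Miami, FL"
c = "United Kingdom"

print(loc.findall(a))  # [('Miami', 'FL')]
print(loc.findall(b))  # [('Boston', 'MA'), ('Miami', 'FL')]
print(loc.findall(c))  # [] because there is no comma to match
```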

Upvotes: 1
