Reputation: 24749
I want to capture Alta, Utah, USA
from asd Alta, Utah, USA qwe
. Basically I'm trying to capture places from a text. It won't be a perfect method, but the places must start with a capital and use a comma, followed by another word with a capital.
So far, I have wrote:
\s[A-Z][a-z]+[,]?
I want to do multiple words, not just the first word, Alta
. This is my attempt to use square brackets inside other square brackets.
[\s[A-Z][a-z]+[,]?]+
But that doesn't work, so it must be syntactically incorrect.
Upvotes: 0
Views: 44
Reputation: 5444
Just joining the party:
import re
dirty = "asd Alta, Utah, USA qwe"
p = re.compile("([A-Z][a-zA-Z]+)")
re.findall(p,dirty)
output:
['Alta', 'Utah', 'USA']
Upvotes: 1
Reputation: 23381
I think this is what you need:
([A-Z][a-zA-Z]+)(,\s*([A-Z][a-zA-Z]+))*
Though the requirement pointed out by @Rizwan (in his comment) is still to be understood.
Upvotes: 2
Reputation: 10476
Updated as per OP's comment:
(?:\s*[A-Z][A-Za-z]+[,\s])+
Original Answer:
\b([A-Z][a-zA-Z]+),?
And you will get the names of the country in group 1 for each match
Upvotes: 2