User

Reputation: 24749

Pattern within a pattern?

I want to capture "Alta, Utah, USA" from "asd Alta, Utah, USA qwe". Basically, I'm trying to capture place names from text. It won't be a perfect method, but each place must start with a capital letter and use a comma, followed by another capitalized word.

So far, I have written:

\s[A-Z][a-z]+[,]?

I want to match multiple words, not just the first word, Alta. This is my attempt, using square brackets inside other square brackets:

[\s[A-Z][a-z]+[,]?]+

But that doesn't work, so it must be syntactically incorrect.

Upvotes: 0

Views: 44

Answers (3)

hurturk

Reputation: 5444

Just joining the party:

import re

dirty = "asd Alta, Utah, USA qwe"
# A capital letter followed by one or more letters
p = re.compile(r"([A-Z][a-zA-Z]+)")
print(p.findall(dirty))

output:

['Alta', 'Utah', 'USA']

Upvotes: 1

Jorge Campos

Reputation: 23381

I think this is what you need:

([A-Z][a-zA-Z]+)(,\s*([A-Z][a-zA-Z]+))*

Though the requirement pointed out by @Rizwan (in his comment) is still to be understood.


Debuggex Demo
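
For anyone who wants to try the pattern above in Python, here is a minimal sketch. It uses the sample string from the question; the variable names are only illustrative. The key point is that parentheses, not square brackets, are what let a whole sub-pattern repeat.

```python
import re

text = "asd Alta, Utah, USA qwe"

# Parentheses group a sub-pattern so the trailing * can repeat it;
# square brackets only define a character class and cannot be nested.
place = re.compile(r"([A-Z][a-zA-Z]+)(,\s*([A-Z][a-zA-Z]+))*")

# re.findall would return tuples of the capture groups here, so use
# finditer and take group(0), the full text of each match.
places = [m.group(0) for m in place.finditer(text)]
print(places)  # ['Alta, Utah, USA']
```

Note that any capitalized word of two or more letters also matches on its own, so ordinary sentence-initial words would show up as false positives.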

Upvotes: 2

Mustofa Rizwan

Reputation: 10476

Updated as per OP's comment:

(?:\s*[A-Z][A-Za-z]+[,\s])+

Demo
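
A rough Python sketch of how the updated pattern might be applied, assuming the sample string from the question (variable names are illustrative):

```python
import re

text = "asd Alta, Utah, USA qwe"

# Non-capturing group repeated one or more times: optional whitespace,
# a capitalized word, then a comma or whitespace character.
pattern = re.compile(r"(?:\s*[A-Z][A-Za-z]+[,\s])+")

# The raw match keeps the surrounding whitespace, so strip it.
places = [m.group(0).strip() for m in pattern.finditer(text)]
print(places)  # ['Alta, Utah, USA']
```

One caveat: the pattern requires a comma or whitespace after every word, so a place at the very end of the input, with nothing after it, would lose its last word.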

Original Answer:

\b([A-Z][a-zA-Z]+),?

Original Demo

You will get each place name in group 1 of each match.
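
As a small illustration in Python (again assuming the question's sample string), the group 1 values can be collected with findall:

```python
import re

text = "asd Alta, Utah, USA qwe"

pattern = re.compile(r"\b([A-Z][a-zA-Z]+),?")

# With exactly one capture group, findall returns that group's text
# for every match, so the optional trailing comma is dropped.
names = pattern.findall(text)
print(names)  # ['Alta', 'Utah', 'USA']

# Join them if a single string is wanted (assumes one place per input).
print(", ".join(names))  # Alta, Utah, USA
```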

Upvotes: 2
