Petr Petrov

Reputation: 4442

Regex: find a lot of patterns in one string

I have a string

деревня Лесное, деревня Пальмово, село Поляково, город Стерлитамак

Desired output

['деревня Лесное', 'деревня Пальмово', 'село Поляково']

I tried to use

villages_compiler = re.compile(r"""\b^(?:[Дд]еревня|[Сс]ело|[Рр]азъезд|[ДдСсПпХх]|[Сс]т|[Дд]ер|[Пп]ос([её]лок|[Кк]оллективный сад)?|[Пп]гт|[Рр]\.?\s?[Пп]|[Сc]адовое товарищество|ДНП|ДНТ|ДПК|ДТ|ЖК|СТ|СНТ|СПК|СО|СК)(?:\.|\s|\.\s)(?:\«?|\"?)[\w\s\.-]+(?:\»?|\"?)""" \
                               r"""|\b^[\w\s-]+(?:[Сс]ельсовет|[Шш]оссе)""")
re.findall(villages_compiler, "деревня Лесное, деревня Пальмово, село Поляково, город Стерлитамак")

But it returns an empty list. When I change findall() to search(), I get only деревня Лесное.

How can I fix that problem?

Upvotes: 0

Views: 72

Answers (2)

Chillie

Reputation: 1485

Edit2:

Make sure you removed both ^s from the pattern and changed the odd group mentioned previously to a non-capturing one.

import re

s = 'деревня Лесное, деревня Пальмово, село Поляково, город Стерлитамак'
expr = r'\b(?:[Дд]еревня|[Сс]ело|[Рр]азъезд|[ДдСсПпХх]|[Сс]т|[Дд]ер|[Пп]ос(?:[её]лок|[Кк]оллективный сад)?|[Пп]гт|[Рр]\.?\s?[Пп]|[Сc]адовое товарищество|ДНП|ДНТ|ДПК|ДТ|ЖК|СТ|СНТ|СПК|СО|СК)(?:\.|\s|\.\s)(?:\«?|\"?)[\w\s\.-]+(?:\»?|\"?)|\b[\w\s-]+(?:[Сс]ельсовет|[Шш]оссе)'

re.findall(expr, s)

This gives me the following output in Python 3.6:

['деревня Лесное', 'деревня Пальмово', 'село Поляково']

I get the same result with a compiled pattern:

comp = re.compile(expr)
comp.findall(s)

Please make sure that you are running this in Python 3+ and that you don't have any typos in your regex.

Edit:

  1. As stated previously, you need to get rid of the ^s in your pattern.
  2. You made [Пп]ос([её]лок|[Кк]оллективный сад)? a capturing group, and the bracket placement looks odd too. When a pattern contains a capturing group, findall returns the group's contents instead of the whole match.

I ended up with this pattern (keeping the weird group but making it non-capturing). Let me know if it works.
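The effect of the capturing group in point 2 is easier to see on a much smaller pattern. The snippet below is only an illustrative sketch: the shortened pattern and the sample string are my own, not the full expression from the question.

```python
import re

# Hypothetical sample string, chosen only to exercise the optional group.
s = "поселок Лесной, пос Южный"

# With a capturing group, findall returns the group's text, not the whole match.
# When the optional group does not take part in a match, its slot is ''.
capturing = re.compile(r"[Пп]ос([её]лок)?\s\w+")
print(capturing.findall(s))      # ['елок', '']

# With a non-capturing group (?:...), findall returns the whole match.
non_capturing = re.compile(r"[Пп]ос(?:[её]лок)?\s\w+")
print(non_capturing.findall(s))  # ['поселок Лесной', 'пос Южный']
```

So a stray capturing group does not stop the regex from matching, it only changes what findall hands back.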

Original post: Your pattern contains a ^ (start-of-string anchor) and you pass only one string, so it can match at the very beginning of that string and nowhere else.

If you remove it from both places, do you get your desired output?
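As a rough sketch of what the ^ does here, compare an anchored and an unanchored version of a cut-down pattern. The two-word alternation and the shortened string are simplifications I picked for the example, not the original expression.

```python
import re

s = "деревня Лесное, село Поляково"

# ^ only matches at the start of the string (no re.MULTILINE here),
# so at most one match is possible.
anchored = re.compile(r"^(?:деревня|село)\s\w+")
print(anchored.findall(s))    # ['деревня Лесное']

# Without ^, \b lets the pattern start at any word boundary.
unanchored = re.compile(r"\b(?:деревня|село)\s\w+")
print(unanchored.findall(s))  # ['деревня Лесное', 'село Поляково']
```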

Regex101 fiddle

Also, as per the docs, search only finds the first location where the pattern matches.
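For comparison, here is a small sketch of search next to finditer, again using a simplified pattern and string that I made up for the example. search stops at the first hit, while finditer walks through every non-overlapping match.

```python
import re

s = "деревня Лесное, село Поляково"
pattern = re.compile(r"\b(?:деревня|село)\s\w+")

# search returns a single match object for the first match (or None).
first = pattern.search(s)
print(first.group(0))  # деревня Лесное

# finditer yields one match object per non-overlapping match.
all_matches = [m.group(0) for m in pattern.finditer(s)]
print(all_matches)     # ['деревня Лесное', 'село Поляково']
```

finditer is handy when you also need positions, since each match object exposes start() and end().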

Upvotes: 1

U13-Forward

Reputation: 71580

As @nhahtdh said, the compiled villages_compiler object has its own findall method, so:

villages_compiler.findall(your_string)

Upvotes: 0
