LiverToll92

Reputation: 97

Regex in Python: combining two regular expressions into one

Suppose I have the following list:

a = ['35','years','opened','7,000','churches','rev.','mr.','brandt','said','adding','denomination','national','goal','one','church','every','10,000','persons']

I want to remove all elements that contain numbers and all elements that end with a dot. So I want to delete '35', '7,000', '10,000', 'mr.' and 'rev.'

I can do it separately using the following regex:

regex = re.compile('[a-zA-Z\.]')
regex2 = re.compile('[0-9]')

But when I try to combine them, I delete either all elements or nothing. How can I combine the two regexes correctly?

Upvotes: 1

Views: 108

Answers (4)

tsumarios

Reputation: 81

This should work:

reg = re.compile(r'[a-zA-Z]+\.|[0-9,]+')

Note that your first regex is wrong because it deletes any string with a dot inside it. To avoid this, I included [a-zA-Z]+\. in the combined regex. Your second regex is also wrong, as it is missing a "+" and a ",", which I included in the above solution. Here is a demo.

Also, if you assume that elements which end with a dot might contain some numbers, the complete solution should be:

reg = re.compile(r'[a-zA-Z0-9]+\.|[0-9,]+')
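The answer does not say how the pattern is applied to the list, so here is a minimal sketch of one way to do it. It assumes each element must match the pattern in full, which is why it uses re.fullmatch; the name kept is only illustrative.

```python
import re

a = ['35', 'years', 'opened', '7,000', 'churches', 'rev.', 'mr.', 'brandt',
     'said', 'adding', 'denomination', 'national', 'goal', 'one', 'church',
     'every', '10,000', 'persons']

# Letters/digits followed by a final dot, or a run of digits and commas.
reg = re.compile(r'[a-zA-Z0-9]+\.|[0-9,]+')

# Assumption: the whole element has to match (fullmatch), so a word with a
# dot in the middle, such as 'per.sons', would not be removed.
kept = [s for s in a if not reg.fullmatch(s)]
print(kept)
```

With re.search instead of re.fullmatch, the first branch would also hit words that merely contain a dot, so the choice of matching function matters here.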

Upvotes: 2

Jan

Reputation: 43169

You could use:

(?:[^\d\n]*\d)|.*\.$

See a demo on regex101.com.
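As a rough sketch, the pattern could be used with re.match, which anchors at the start of each string. The use of re.match and the name kept are assumptions on my side, not something the answer specifies.

```python
import re

a = ['35', 'years', 'opened', '7,000', 'churches', 'rev.', 'mr.', 'brandt',
     'said', 'adding', 'denomination', 'national', 'goal', 'one', 'church',
     'every', '10,000', 'persons']

pattern = re.compile(r'(?:[^\d\n]*\d)|.*\.$')

# First branch: skip non-digits from the start until a digit is found.
# Second branch: any string whose last character is a dot.
kept = [s for s in a if not pattern.match(s)]
print(kept)
```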

Upvotes: 1

Toto

Reputation: 91385

Here is a way to do the job:

import re

a = ['35','years','opened','7,000','churches','rev.','mr.','brandt','said','adding','denomination','national','goal','one','church','every','10,000','per.sons']
b = []
for s in a:
    if not re.search(r'^(?:[\d,]+|.*\.)$', s):
        b.append(s)
print(b)

Output:

['years', 'opened', 'churches', 'brandt', 'said', 'adding', 'denomination', 'national', 'goal', 'one', 'church', 'every', 'per.sons']

Demo & explanation

Upvotes: 0

BadHorsie

Reputation: 14544

If you don't need to capture the result, this matches any string with a dot at the end, or any with a number in it.

\.$|\d
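Because this pattern is not anchored at the start, it needs re.search rather than re.match. A small sketch of filtering the list from the question with it (the name kept is illustrative):

```python
import re

a = ['35', 'years', 'opened', '7,000', 'churches', 'rev.', 'mr.', 'brandt',
     'said', 'adding', 'denomination', 'national', 'goal', 'one', 'church',
     'every', '10,000', 'persons']

# A dot at the very end, or a digit anywhere in the string.
pattern = re.compile(r'\.$|\d')

# search scans the whole string, so either condition can trigger removal.
kept = [s for s in a if not pattern.search(s)]
print(kept)
```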

Upvotes: 1
