vathymut

Reputation: 1047

Pythonic way to concatenate regex objects

I have Python regex objects, say re_first and re_second, that I would like to concatenate.

import re
FLAGS_TO_USE = re.VERBOSE | re.IGNORECASE
re_first = re.compile( r"""abc #Some comments here """, FLAGS_TO_USE )
re_second = re.compile( r"""def #More comments here """, FLAGS_TO_USE )

I want a single regex that matches either of the patterns above. So far, I have

pattern_combined = re_first.pattern + '|' + re_second.pattern
re_combined = re.compile( pattern_combined, FLAGS_TO_USE ) 

This doesn't scale well as the number of regex objects grows. I end up with something like:

pattern_combined = '|'.join( [ first.pattern, second.pattern, third.pattern, etc ] )

The point is that the list to concatenate can be very long. Any ideas how to avoid this mess? Thanks in advance.

Upvotes: 15

Views: 25034

Answers (3)

Edgar Manukyan

Reputation: 1301

You can also concatenate raw strings directly, for example:

prep_re = r"\b" + r"\b|\b".join(prepositions) + r"\b"
re.findall(prep_re, paragraph, re.IGNORECASE)
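A slightly more defensive variant of the same idea, as a rough sketch: it escapes each word with re.escape and wraps the alternation in one non-capturing group so a single pair of \b anchors covers every word. The values of prepositions and paragraph are made up here for illustration, since the answer does not show them.

```python
import re

# Hypothetical sample data; the answer above does not show these values.
prepositions = ["in", "on", "at", "up to"]
paragraph = "We met at noon, in the park, On the hill."

# re.escape guards against words that contain regex metacharacters, and a
# single non-capturing group lets one pair of \b anchors cover every word.
prep_re = r"\b(?:" + "|".join(re.escape(p) for p in prepositions) + r")\b"

matches = re.findall(prep_re, paragraph, re.IGNORECASE)
print(matches)
```

Escaping only makes sense when the list holds literal words. If the items are already regex fragments, skip re.escape.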

Upvotes: 1

Oscar Mederos

Reputation: 29863

I don't think you will find a solution that doesn't involve creating a list with the regex objects first. I would do it this way:

# create patterns here...
re_first = re.compile(...)
re_second = re.compile(...)
re_third = re.compile(...)

# create a list with them
regexes = [re_first, re_second, re_third]

# create the combined one
pattern_combined = '|'.join(x.pattern for x in regexes)

Of course, you can also do the opposite: Combine the patterns and then compile, like this:

pattern1 = r'pattern-1'
pattern2 = r'pattern-2'
pattern3 = r'pattern-3'

patterns = [pattern1, pattern2, pattern3]

compiled_combined = re.compile('|'.join(patterns), FLAGS_TO_USE)
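One thing to watch for when joining the .pattern strings of regexes compiled with re.VERBOSE: if a pattern ends in a # comment with no newline after it, the '|' and everything following it end up inside that comment. A minimal sketch of a helper that avoids this is below. The name combine_regexes is made up for this example (it is not part of the re module), and the two patterns are the single-line ones from the question.

```python
import re

FLAGS_TO_USE = re.VERBOSE | re.IGNORECASE


def combine_regexes(regexes, flags=0):
    """Return one compiled regex matching any of the given compiled regexes.

    Hypothetical helper for illustration, not part of the re module.
    """
    # In verbose mode, a newline before ")" ends any trailing "#" comment
    # and is otherwise ignored as whitespace. Without re.VERBOSE a newline
    # would be matched literally, so it is only added when the flag is set.
    closer = "\n)" if flags & re.VERBOSE else ")"
    # Wrap each pattern in a non-capturing group so the alternation cannot
    # change the meaning of the individual patterns.
    parts = ["(?:" + r.pattern + closer for r in regexes]
    return re.compile("|".join(parts), flags)


re_first = re.compile(r"""abc #Some comments here """, FLAGS_TO_USE)
re_second = re.compile(r"""def #More comments here """, FLAGS_TO_USE)

re_combined = combine_regexes([re_first, re_second], FLAGS_TO_USE)
print(bool(re_combined.search("xx DEF yy")))
```

The flags still have to be passed again when compiling the combined pattern, because .pattern only carries the pattern text. This sketch assumes all the regexes were compiled with the same flags.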

Upvotes: 20

planestepper

Reputation: 3317

Toss the pattern strings into a list, and then

'|'.join(your_list)

Upvotes: 5
