user1850133

Reputation: 2993

Python: return the matching and non-matching parts of a string

I would like to split a string into a list containing both the parts that match a regexp pattern and the parts that do not.

For example

import re
string = 'my_file_10'
pattern = r'\d+$'
# I know the matching part can be obtained with:
m = re.search(pattern, string).group()
print(m)
# prints: 10
# The final result should be as follows:
# ['my_file_', '10']

Upvotes: 5

Views: 5926

Answers (2)

hwnd

Reputation: 70722

You can use re.split with a capturing group to build a list of the matching and non-matching parts, then use filter(None, ...) to drop every element that is considered false (here, the empty strings). In Python 3, filter returns an iterator, so wrap it in list():

>>> import re
>>> list(filter(None, re.split(r'(\d+$)', 'my_file_015_01')))
['my_file_015_', '01']
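
The same idea can be wrapped in a small helper. This is only a sketch: the name split_keep_matches is hypothetical, and it assumes the pattern passed in has no capturing groups of its own, since re.split would add those groups to the result as well.

```python
import re

def split_keep_matches(pattern, text):
    # Hypothetical helper, not part of the re module.
    # Wrap the pattern in a capturing group so re.split keeps the matched text.
    parts = re.split('(' + pattern + ')', text)
    # Drop the empty strings that appear when a match touches either end.
    return [part for part in parts if part]

print(split_keep_matches(r'\d+$', 'my_file_015_01'))
print(split_keep_matches(r'\d+', 'a1b22c'))
```

A list comprehension is used instead of filter so the result is a list on both Python 2 and Python 3.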

Upvotes: 3

Martijn Pieters

Reputation: 1121486

Put parentheses around the pattern to make it a capturing group, then use re.split() to produce a list of matching and non-matching elements:

pattern = r'(\d+$)'
re.split(pattern, string)

Demo:

>>> import re
>>> string = 'my_file_10'
>>> pattern = r'(\d+$)'
>>> re.split(pattern, string)
['my_file_', '10', '']

Because the match sits at the very end of the string, re.split() also returns the empty string that follows it as the last element.
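
To see where the empty strings come from, it may help to compare matches at the end, at the start, and in the middle of the input. The extra input strings below are made up for illustration and are not from the question.

```python
import re

# Match at the end: the text after the match is empty.
at_end = re.split(r'(\d+$)', 'my_file_10')
# Match at the start: the text before the match is empty.
at_start = re.split(r'(^\d+)', '10_my_file')
# Match in the middle: both neighbours are non-empty.
in_middle = re.split(r'(\d+)', 'my_10_file')

print(at_end)     # ['my_file_', '10', '']
print(at_start)   # ['', '10', '_my_file']
print(in_middle)  # ['my_', '10', '_file']
```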

If you only ever expect one match at the end of the string (which the $ in your pattern enforces here), use the match object's start() method to get the index at which to slice the input string:

pattern = r'\d+$'
match = re.search(pattern, string)
not_matched, matched = string[:match.start()], match.group()

This returns:

>>> pattern = r'\d+$'
>>> match = re.search(pattern, string)
>>> string[:match.start()], match.group()
('my_file_', '10')
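
Note that re.search() returns None when nothing matches, so match.start() would then raise an AttributeError. A minimal sketch of a guard follows. The helper name split_trailing_match is hypothetical, returning the whole string as a single element is just one possible choice for the no-match case, and the helper assumes the pattern is anchored with $ (otherwise any text after the match is dropped).

```python
import re

def split_trailing_match(pattern, text):
    # Hypothetical helper; assumes the pattern is anchored at the end with $.
    match = re.search(pattern, text)
    if match is None:
        # No match: return the input unchanged as the only element.
        return [text]
    # Slice off everything before the match, then append the matched text.
    return [text[:match.start()], match.group()]

print(split_trailing_match(r'\d+$', 'my_file_10'))
print(split_trailing_match(r'\d+$', 'my_file'))
```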

Upvotes: 12
