user1330974

Reputation: 2616

Regular expression to match repeated occurrence of a pattern

I have a few possible input strings like below:

Roll|N/A|300x60|(1x1)|AAA|BBB

Desktop|1x1|(1x1)|AAA|BBB

Desktop|NA|(NA)|AAA|BBB

Roll|N/A|N/A|(1x1)|AAA|BBB

From these, I'm trying to detect patterns of the form \d+x\d+ (e.g., '300x60' and '1x1' from the first line; '1x1' and '1x1' from the second; none from the third; and '1x1' from the last). Could someone show me how to write a Python regular expression search that captures zero, one, or many occurrences of such a pattern in a given string? I already tried the code below, but it captures only one occurrence (either the first or the second) of the pattern in a given string. Thank you!

r = re.search(r'(\(?\d+x\d+\)?)+', my_str)
r.group() # only gives me '300x60' for the first input above

Upvotes: 1

Views: 747

Answers (2)

Quinn

Reputation: 4504

You could do it like this:

import re
input_strings = ['Roll|N/A|300x60|(1x1)|AAA|BBB', 'Desktop|1x1|(1x1)|AAA|BBB',
                 'Desktop|NA|(NA)|AAA|BBB', 'Roll|N/A|N/A|(1x1)|AAA|BBB']

# findall returns [] when nothing matches; "or None" turns that into None
print([re.findall(r'\d+x\d+', i) or None for i in input_strings])

Output:

[['300x60', '1x1'], ['1x1', '1x1'], None, ['1x1']]

Upvotes: 1

Wiktor Stribiżew

Reputation: 627517

You can use

import re
my_strs = ["Roll|N/A|300x60|(1x1)|AAA|BBB", "Desktop|1x1|(1x1)|AAA|BBB", "Desktop|NA|(NA)|AAA|BBB", "Roll|N/A|N/A|(1x1)|AAA|BBB"]
print([re.findall(r'\d+x\d+', s) for s in my_strs])
# => [['300x60', '1x1'], ['1x1', '1x1'], [], ['1x1']]

See the IDEONE demo and the regex demo.

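The main point is to use re.findall, which returns all non-overlapping matches in the string (or the captured substrings if the pattern contains capturing groups; the pattern I suggest has none). The issue with your attempt is that you tried to match repeated occurrences with a single search operation. Since the substrings are not adjacent (other characters such as | and ( sit between them), the + quantifier cannot repeat the group across them, so re.search returns only a single result.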
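To make the difference concrete, here is a minimal sketch that runs both calls on the first input string from the question. The helper name find_dimensions and the constant PATTERN are only illustrative and are not part of the original code; the last line shows how a capturing group changes what re.findall returns.

```python
import re

# Compiled once; PATTERN and find_dimensions are hypothetical names for this sketch.
PATTERN = re.compile(r'\d+x\d+')

def find_dimensions(text):
    """Return every non-overlapping WxH token in text (empty list if none)."""
    return PATTERN.findall(text)

s = "Roll|N/A|300x60|(1x1)|AAA|BBB"

# re.search stops at the first match. The two tokens are separated by '|(',
# so the '+' quantifier has nothing adjacent to repeat on.
first = re.search(r'(\(?\d+x\d+\)?)+', s)
print(first.group())                      # 300x60

# re.findall scans the whole string and collects every match.
print(find_dimensions(s))                 # ['300x60', '1x1']

# With a capturing group, findall returns the group text, not the whole match.
print(re.findall(r'\((\d+x\d+)\)', s))    # ['1x1']
```

If you also need the position of each match, re.finditer yields match objects instead of plain strings.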

Upvotes: 1
