Regular expression to match repeated occurrence of a pattern

Question

I have a few possible input strings like below:

Roll|N/A|300x60|(1x1)|AAA|BBB

Desktop|1x1|(1x1)|AAA|BBB

Desktop|NA|(NA)|AAA|BBB

Roll|N/A|N/A|(1x1)|AAA|BBB

from which I'm trying to detect patterns of the form \d+x\d+ (e.g., '300x60' and '1x1' from the first line; '1x1' and '1x1' from the second; none from the third; and '1x1' from the last). Could someone show me how to write a Python regular expression search that captures zero, one, or many occurrences of such a pattern in a given string? I have already tried the code below, but it only captures either the first or the second occurrence of the pattern in a given string, never all of them. Thank you!

r = re.search('($?\d+x\d+$?)+', my_str) 
r.group() # only gives me '300x60' for the first input above

Wiktor Stribiżew · Accepted Answer

You can use

import re
my_strs = ["Roll|N/A|300x60|(1x1)|AAA|BBB", "Desktop|1x1|(1x1)|AAA|BBB", "Desktop|NA|(NA)|AAA|BBB", "Roll|N/A|N/A|(1x1)|AAA|BBB"]
print([re.findall(r'\d+x\d+', s) for s in my_strs])
# => [['300x60', '1x1'], ['1x1', '1x1'], [], ['1x1']]

See the IDEONE demo and the regex demo.

The main point is to use re.findall, which returns all non-overlapping matches (or the captured substrings if the pattern contains capturing groups, but the pattern I suggest has none). The issue with your attempt is that you tried to collect repeated captures with a single search operation. Since the matching substrings are not adjacent to one another, a quantified group cannot span them, so you only got a single result.
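To make the difference concrete, here is a minimal sketch comparing re.search, re.findall, and re.finditer on the first sample string. The variable names (PATTERN, sample, and so on) are only illustrative, and the capturing-group variant is an assumption about what you might want next, not part of the original question.

```python
import re

PATTERN = r'\d+x\d+'                      # same pattern as suggested above
sample = "Roll|N/A|300x60|(1x1)|AAA|BBB"  # first input from the question

# re.search stops at the first match, so later occurrences are never seen.
first = re.search(PATTERN, sample)
first_text = first.group() if first else None

# Quantifying a group does not help here: the repetitions must be adjacent,
# and '300x60' is followed by '|', so the group matches exactly once.
quantified = re.search(r'(\d+x\d+)+', sample)
quantified_text = quantified.group() if quantified else None

# re.findall scans the whole string and returns every non-overlapping match.
all_matches = re.findall(PATTERN, sample)

# With a capturing group, re.findall returns the group contents instead of
# the full match (here: only the number before the 'x').
widths = re.findall(r'(\d+)x\d+', sample)

# re.finditer yields match objects, which is handy if positions are needed.
positions = [(m.group(), m.start()) for m in re.finditer(PATTERN, sample)]

print(first_text)       # 300x60
print(quantified_text)  # 300x60
print(all_matches)      # ['300x60', '1x1']
print(widths)           # ['300', '1']
print(positions)        # [('300x60', 9), ('1x1', 17)]
```

If you need the full match and also want to group part of the pattern, a non-capturing group (?:...) keeps re.findall returning whole matches.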
