Reputation: 2616
I have a few possible input strings like below:
Roll|N/A|300x60|(1x1)|AAA|BBB
Desktop|1x1|(1x1)|AAA|BBB
Desktop|NA|(NA)|AAA|BBB
Roll|N/A|N/A|(1x1)|AAA|BBB
from which I'm trying to extract every substring that matches the pattern \d+x\d+
(e.g., '300x60' and '1x1' from the first line; '1x1' and '1x1' from the second; None from the third; and '1x1' from the last). Could someone show me how to write a Python regular expression search that captures zero, one, or many occurrences of this pattern in a given string? I already tried the code below, but it only captures a single occurrence (either the first or the second) of the pattern in a given string. Thank you!
r = re.search('(\(?\d+x\d+\)?)+', my_str)
r.group() # only gives me '300x60' for the first input above
Upvotes: 1
Views: 747
Reputation: 4504
You could do it like this:
import re
input_strings = ['Roll|N/A|300x60|(1x1)|AAA|BBB', 'Desktop|1x1|(1x1)|AAA|BBB',
                 'Desktop|NA|(NA)|AAA|BBB', 'Roll|N/A|N/A|(1x1)|AAA|BBB']
# findall returns [] when nothing matches; "or None" turns that into None
print([re.findall(r'\d+x\d+', i) or None for i in input_strings])
Output:
[['300x60', '1x1'], ['1x1', '1x1'], None, ['1x1']]
Upvotes: 1
Reputation: 627517
You can use
import re
my_strs = ["Roll|N/A|300x60|(1x1)|AAA|BBB", "Desktop|1x1|(1x1)|AAA|BBB", "Desktop|NA|(NA)|AAA|BBB", "Roll|N/A|N/A|(1x1)|AAA|BBB"]
print([re.findall(r'\d+x\d+', s) for s in my_strs])
# => [['300x60', '1x1'], ['1x1', '1x1'], [], ['1x1']]
See the IDEONE demo and the regex demo.
The main point is to use re.findall, which returns all non-overlapping matches in the string (or the captured substrings when the pattern contains capturing groups; the pattern I suggest has none). The issue with your attempt is that you tried to match repeated captures with a single re.search call. A quantified group only repeats over adjacent text, and since your substrings are separated by other characters rather than adjoining, you only got a single result.
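To make the difference concrete, here is a minimal sketch that runs both approaches on the first input from the question. The variable names (first_match, all_sizes, widths, spans) are only illustrative, and the capturing-group and re.finditer variants are additions of mine to show related behaviour, not something the question asked for.

```python
import re

my_str = "Roll|N/A|300x60|(1x1)|AAA|BBB"

# re.search returns only the first match. The trailing + cannot repeat the
# group here because a "|" separates the two sizes.
first = re.search(r'(\(?\d+x\d+\)?)+', my_str)
first_match = first.group()

# re.findall scans the whole string and returns every non-overlapping match.
all_sizes = re.findall(r'\d+x\d+', my_str)

# With exactly one capturing group, findall returns the group's text
# instead of the whole match.
widths = re.findall(r'(\d+)x\d+', my_str)

# re.finditer yields match objects, useful when positions are needed too.
spans = [(m.group(), m.start()) for m in re.finditer(r'\d+x\d+', my_str)]

print(first_match)  # 300x60
print(all_sizes)    # ['300x60', '1x1']
print(widths)       # ['300', '1']
print(spans)        # [('300x60', 9), ('1x1', 17)]
```

If you need None rather than an empty list for strings with no match, `re.findall(...) or None` is one simple way to get it.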
Upvotes: 1