Reputation: 133
I am trying to capture each List[int]
(a list of integers, possibly separated by commas) in a string. However, I am not getting the expected result.
>>> txt = '''Automatic face localisation is the prerequisite step of
facial image analysis for many applications such as facial attribute
(e.g. expression [64] and age [38]) and facial identity
recognition [45, 31, 55, 11]. A narrow definition of face localisation
may refer to traditional face detection [53, 62], '''
My attempt and its output:
>>> re.findall(r'[(\b\d{1,3}\b,)+]',txt)
['(', '6', '4', '3', '8', ')', '4', '5', ',', '3', '1', ',', '5', '5', ',', '1', '1', '5', '3', ',', '6', '2', ',']
What expression would capture the output below?
Expected output:
['[64]', '[38]', '[45, 31, 55, 11]', '[53, 62]']
Upvotes: 2
Views: 505
Reputation: 163362
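Your pattern fails because everything between the outer [ and ] is a character class, so it matches one character at a time. Instead, match a literal [ and 1-3 digits, then repeat 0+ times a comma, 0+ spaces and 1-3 more digits, and finish with the closing ].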
\[\d{1,3}(?:, *\d{1,3})*]
\[ : match a literal [
\d{1,3} : match 1-3 digits
(?: : start a non-capturing group
, *\d{1,3} : match a comma, 0+ spaces and 1-3 digits
)* : close the group and repeat it 0+ times
] : match a literal ]
Example
import re
txt = '''Automatic face localisation is the prerequisite step of facial image analysis for many applications such as facial attribute (e.g. expression [64] and age [38]) and facial identity
recognition [45, 31, 55, 11]. A narrow definition of face localisation may refer to traditional face detection [53, 62],
'''
# findall returns every non-overlapping match of the whole pattern
print(re.findall(r'\[\d{1,3}(?:, *\d{1,3})*]', txt))
Output
['[64]', '[38]', '[45, 31, 55, 11]', '[53, 62]']
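If the goal is the integers themselves rather than the bracketed strings, one possible follow-up is to run a second findall over each match. This is only a sketch: the shortened txt and the names pattern and groups are made up for illustration.

```python
import re

# Shortened sample text (an assumption; any text with bracketed lists works)
txt = "expression [64] and age [38]) and recognition [45, 31, 55, 11]; detection [53, 62],"

pattern = re.compile(r'\[\d{1,3}(?:, *\d{1,3})*]')

# For each bracketed match, pull out the digit runs and convert them to int
groups = [[int(n) for n in re.findall(r'\d+', m)] for m in pattern.findall(txt)]
print(groups)
# [[64], [38], [45, 31, 55, 11], [53, 62]]
```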
If there can be more digits and spaces on all sides, including continuing the sequence on a newline:
\[\s*\d+(?:\s*,\s*\d+)*\s*]
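To see how the two patterns differ, here is a small comparison on a made-up string (the sample text and variable names are assumptions, not part of the question). The permissive pattern accepts longer numbers, padding and a line break inside the brackets, while both patterns still reject a trailing comma or non-digit content.

```python
import re

# Hypothetical sample: padded list spanning two lines, a plain list,
# and two malformed lists that neither pattern should match
txt = "see [ 1024 ,7,\n  12 ] and [3] but not [a, 1] or [12, ]"

strict = r'\[\d{1,3}(?:, *\d{1,3})*]'
loose = r'\[\s*\d+(?:\s*,\s*\d+)*\s*]'

strict_hits = re.findall(strict, txt)
loose_hits = re.findall(loose, txt)

print(strict_hits)  # ['[3]']
print(loose_hits)   # ['[ 1024 ,7,\n  12 ]', '[3]']
```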
Upvotes: 2
Reputation:
You may try:
\[[\d, ]*?]
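Explanation of the above regex: \[ matches a literal [, the character class [\d, ]*? lazily matches zero or more digits, commas or spaces, and the final ] matches the closing bracket.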
Sample Implementation in python
import re
regex = r"\[[\d, ]*?]"
test_str = ("Automatic face localisation is the prerequisite step of facial image analysis for many applications such as facial attribute (e.g. expression [64] and age [38]) and facial identity\n"
"... recognition [45, 31, 55, 11]. A narrow definition of face localisation may refer to traditional face detection [53, 62]")
print(re.findall(regex, test_str))
# Outputs: ['[64]', '[38]', '[45, 31, 55, 11]', '[53, 62]']
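One thing to be aware of: because the character class allows any mix of digits, commas and spaces (including none at all), this pattern also accepts brackets that hold no number. A quick check on a made-up string (the sample and names are assumptions), with the stricter digit-by-digit pattern for comparison:

```python
import re

# Hypothetical sample with one real list and three degenerate brackets
sample = "refs [45, 31], empty [], stray [, ,] and spaces [ ]"

loose = r"\[[\d, ]*?]"
strict = r"\[\d{1,3}(?:, *\d{1,3})*]"

loose_hits = re.findall(loose, sample)
strict_hits = re.findall(strict, sample)

print(loose_hits)   # ['[45, 31]', '[]', '[, ,]', '[ ]']
print(strict_hits)  # ['[45, 31]']
```

If the input never contains such brackets, the shorter pattern is fine; otherwise the stricter one may be the safer choice.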
Upvotes: 2