Reputation: 45
import re
def extraction(parentTag):
    should_retain = True
    for imageTag in parentTag:
        if re.search("^(\d+.+\d)",imageTag) and not re.search("^(\d+.+\d[-^]\w)",imageTag) and not re.search("^(\d+.+\d[-^]\d)",imageTag):
            should_retain = False
            break
    if should_retain:
        return parentTag
    return None
expected_input = [
    ['419adf7', '1.0.22-SNAPSSHOT'],
    ['1.0.24', '82e13c1', 'master'],
    ['1.0.25-1618314650'],
    ['1.0.10', '7ad4886'],
    ['1.0.13-1589279873', 'e597811'],
    ['73a3788'],
]
expected_input = list(filter(None,list(map(extraction, expected_input))))
print(expected_input)
Current Output = [['1.0.25-1618314650'], ['1.0.13-1589279873', 'e597811']]
Expected Output = [['1.0.25-1618314650'], ['1.0.13-1589279873', 'e597811'], ['419adf7', '1.0.22-SNAPSSHOT'], ['73a3788']]
Also, is there a better way to write this code with regex so that it produces the Expected Output?
Upvotes: 1
Views: 48
Reputation: 626747
You can use
import re
rx1 = re.compile(r'^\d+\.\d+\.\d+-\w')
rx2 = re.compile(r'^\d+\.\d+\.\d+$')
def extraction(parentTag):
    return [x for x in parentTag if any(rx1.match(e) for e in x) or not any(rx2.match(e) for e in x)]
expected_input = [
    ['419adf7', '1.0.22-SNAPSSHOT'],
    ['1.0.24', '82e13c1', 'master'],
    ['1.0.25-1618314650'],
    ['1.0.10', '7ad4886'],
    ['1.0.13-1589279873', 'e597811'],
    ['73a3788'],
]
expected_input = extraction(expected_input)
print(expected_input)
Output:
[['419adf7', '1.0.22-SNAPSSHOT'], ['1.0.25-1618314650'], ['1.0.13-1589279873', 'e597811'], ['73a3788']]
See the Python demo.
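To see why each sublist is kept or dropped, the two checks can be evaluated separately. This is only an illustrative sketch: the helper name explain and the variable loose are made up for this example, and loose is the first pattern from the question, included only for comparison.

```python
import re

# Patterns from the answer above: a version with a suffix, and a bare version.
rx1 = re.compile(r'^\d+\.\d+\.\d+-\w')
rx2 = re.compile(r'^\d+\.\d+\.\d+$')
# First pattern from the question (hypothetical name), kept only for comparison.
loose = re.compile(r'^(\d+.+\d)')

def explain(tags):
    """Return (has_suffixed_version, has_bare_version, keep) for one sublist."""
    has_suffixed = any(rx1.match(t) for t in tags)
    has_bare = any(rx2.match(t) for t in tags)
    return has_suffixed, has_bare, has_suffixed or not has_bare

samples = [
    ['419adf7', '1.0.22-SNAPSSHOT'],
    ['1.0.24', '82e13c1', 'master'],
    ['73a3788'],
]
for tags in samples:
    print(tags, explain(tags))

# The unescaped dot lets the loose pattern match a commit hash that
# starts with a digit, while the strict pattern does not match it.
print(bool(loose.match('419adf7')), bool(rx2.match('419adf7')))
```

The last line suggests why the original code dropped ['419adf7', '1.0.22-SNAPSSHOT'] and ['73a3788']: with an unescaped dot, a hash such as 419adf7 looks like a bare version number to the first pattern.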
NOTE:
A sublist is kept if at least one item matches ^\d+\.\d+\.\d+-\w (see any(rx1.match(e) for e in x)), or if no item matches the ^\d+\.\d+\.\d+$ pattern (see not any(rx2.match(e) for e in x)).
You do not need map(extraction, expected_input). Pass the whole list of lists as the argument to the extraction function.
Upvotes: 1
Reputation: 438
Regarding the last question, about simplifying your extraction function, here is a more compact version that replaces the loop and flag with any():
import re
def extraction(parentTag):
    # Retain the sublist only if no tag looks like a bare version number.
    should_retain = not any(
        re.search(r"^(\d+.+\d)", imageTag)
        and not re.search(r"^(\d+.+\d[-^]\w)", imageTag)
        and not re.search(r"^(\d+.+\d[-^]\d)", imageTag)
        for imageTag in parentTag
    )
    if should_retain:
        return parentTag
    return None
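As a quick sanity check, the sketch below compares a condensed form of the loop from the question with the any() version on the question's data. The names extraction_loop, extraction_any and data are made up for this comparison. Since the patterns are unchanged, this refactoring should reproduce the question's Current Output rather than the Expected Output.

```python
import re

def extraction_loop(parentTag):
    # Condensed form of the loop-based version from the question.
    for imageTag in parentTag:
        if (re.search(r"^(\d+.+\d)", imageTag)
                and not re.search(r"^(\d+.+\d[-^]\w)", imageTag)
                and not re.search(r"^(\d+.+\d[-^]\d)", imageTag)):
            return None
    return parentTag

def extraction_any(parentTag):
    # any()-based version from this answer.
    should_retain = not any(
        re.search(r"^(\d+.+\d)", imageTag)
        and not re.search(r"^(\d+.+\d[-^]\w)", imageTag)
        and not re.search(r"^(\d+.+\d[-^]\d)", imageTag)
        for imageTag in parentTag
    )
    return parentTag if should_retain else None

data = [
    ['419adf7', '1.0.22-SNAPSSHOT'],
    ['1.0.24', '82e13c1', 'master'],
    ['1.0.25-1618314650'],
    ['1.0.10', '7ad4886'],
    ['1.0.13-1589279873', 'e597811'],
    ['73a3788'],
]

loop_result = list(filter(None, map(extraction_loop, data)))
any_result = list(filter(None, map(extraction_any, data)))
print(loop_result == any_result)
print(any_result)
```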
Upvotes: 0