Reputation: 7625
I would like to extract a substring from every item of a list. The substring must come after the 'opt_' prefix, and items that end with the '_join' suffix must be skipped.
My input:
my_opts = [
    'opt_tw',
    'opt_ls_join',
    'opt_ac_join',
    'opt_pan_join',
    'opt_full_led',
]
Desired output:
['tw', 'full_led']
What I have tried:
>>> import re
>>> pattern = r'opt_?(.*)[^_join]'
>>> print([
...     re.search(pattern, opt).group(1)
...     for opt in my_opts
...     if re.match(pattern, opt)
... ])
['t', 'l', 'a', 'p', 'full_le']
Can you help me, please?
Upvotes: 1
Views: 1857
Reputation: 163342
You can match opt_ and optionally match until the last occurrence of _. Then assert that join is not at the end of the string, and capture everything after opt_ in group 1.
opt_((?:.*_)?(?!join$)[^\r\n_]+)$
opt_         Match literally
(            Capture group 1
(?:.*_)?     Optionally match until the last occurrence of _
(?!join$)    Negative lookahead, assert not join at the end of the string
[^\r\n_]+    Match 1+ times any char except _ (or newlines)
)            Close group 1
$            End of string

import re
my_opts = [
    'opt_tw',
    'opt_ls_join',
    'opt_ac_join',
    'opt_pan_join',
    'opt_full_led',
]
pattern = r"opt_((?:.*_)?(?!join$)[^\r\n_]+)$"
for s in my_opts:
    match = re.match(pattern, s)
    if match:
        print(match.group(1))
Output
tw
full_led
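As a side note on why the attempt in the question truncates its results: [^_join] is a negated character class, so it consumes exactly one character that is not _, j, o, i or n. It does not mean "not followed by _join". A minimal sketch reproducing that behaviour (the names broken and broken_result are only illustrative):

```python
import re

my_opts = ['opt_tw', 'opt_ls_join', 'opt_ac_join', 'opt_pan_join', 'opt_full_led']

# [^_join] matches ONE character outside the set {_, j, o, i, n}.
# The greedy (.*) therefore backtracks until the next character is
# outside that set, and that character is left out of group 1.
broken = r'opt_?(.*)[^_join]'

broken_result = [re.search(broken, opt).group(1) for opt in my_opts]
print(broken_result)  # ['t', 'l', 'a', 'p', 'full_le']
```

This is why a lookahead such as (?!join$) is used above: it checks what follows without consuming any characters.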
If the string should not contain _join anywhere, you can use a negative lookahead instead:
^opt_(?!.*_join)(.+)
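The two patterns differ only when _join appears somewhere other than the end. A small sketch of that difference, assuming a made-up sample 'opt_a_join_b' that is not part of the question's input (the helper extract and the variable names are also just for illustration):

```python
import re

# Rejects only strings whose last segment is "join".
suffix_only = re.compile(r"opt_((?:.*_)?(?!join$)[^\r\n_]+)$")
# Rejects strings that contain "_join" anywhere after the prefix.
no_join_anywhere = re.compile(r"^opt_(?!.*_join)(.+)")

# 'opt_a_join_b' is a hypothetical value added to show the difference.
samples = ["opt_tw", "opt_ls_join", "opt_a_join_b", "opt_full_led"]


def extract(regex, items):
    """Return group 1 for every item the regex matches from the start."""
    return [m.group(1) for m in map(regex.match, items) if m]


suffix_result = extract(suffix_only, samples)
anywhere_result = extract(no_join_anywhere, samples)

print(suffix_result)    # ['tw', 'a_join_b', 'full_led']
print(anywhere_result)  # ['tw', 'full_led']
```

For the input in the question both patterns give the same result, so the choice only matters if _join can occur in the middle of an option name.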
Upvotes: 3
Reputation: 18306
You can use str.startswith and str.endswith for the conditions, and string slicing in place of the capturing group:
out = [opt[4:]
       for opt in my_opts
       if opt.startswith("opt_") and not opt.endswith("_join")]
where 4 is the length of "opt_", so opt[4:] is the substring after the prefix. This gives:
>>> out
['tw', 'full_led']
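If the prefix or suffix might change, the literal 4 can be replaced with len(prefix). A possible sketch, where the function name strip_options and its parameters are hypothetical and not part of the approach above:

```python
def strip_options(options, prefix="opt_", excluded_suffix="_join"):
    """Return the part after `prefix` for items that start with `prefix`
    and do not end with `excluded_suffix`."""
    return [
        opt[len(prefix):]  # slice off the prefix without a magic number
        for opt in options
        if opt.startswith(prefix) and not opt.endswith(excluded_suffix)
    ]


my_opts = ['opt_tw', 'opt_ls_join', 'opt_ac_join', 'opt_pan_join', 'opt_full_led']
result = strip_options(my_opts)
print(result)  # ['tw', 'full_led']
```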
Upvotes: 2