Reputation: 661
I'm having trouble implementing a regex pattern in Python. My expression works on https://regexr.com/ but I can't get it to work in Python.
Here is my expression: [abcd][(]\d+[)]|(ab)[(]\d+[)]|(abcd)[(]\d+[)]
I want to find and return instances of a(\d+), b(\d+), c(\d+), d(\d+), ab(\d+), or abcd(\d+)
expressions = re.findall(r"[abcd][(]\d+[)]|(ab)[(]\d+[)]|(abcd)[(]\d+[)]",line)
print(expressions)
I think it might be working, because when I have something in the string that should match the pattern I get [('', '')] as my output instead of []
Any thoughts?
Upvotes: 1
Views: 56
Reputation: 9930
expressions = re.findall(r"[abcd](\d+)|(ab)(\d+)|(abcd)(\d+)",line)
print(expressions)
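This works if the parentheses around \d+ are meant as capture groups, that is, if the input looks like a123 rather than a(123).
Note that [(] is valid in Python as well: inside a character class, ( is a literal character. The [('', '')] in your output comes from the capture groups (ab) and (abcd). When a pattern contains groups, re.findall returns the groups instead of the whole match, and a group that did not take part in the match comes back as an empty string.
If you mean the parentheses in (\d+) literally, the more common way is to escape them as \( and \).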
Be aware that if you put parentheses around ab or abcd, they become capture groups and will be listed in the result.
I would not add parentheses unless they are necessary.
expressions = re.findall(r"[abcd]\(\d+\)|ab\(\d+\)|abcd\(\d+\)",line)
print(expressions)
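As a minimal sketch of why the question's pattern printed [('', '')], the two variants can be compared side by side. The sample string below is an assumption made for illustration; it is not taken from the question.

```python
import re

line = "a(123) ab(35)"  # sample input, assumed for illustration

# Pattern from the question: it contains two capturing groups, (ab) and (abcd),
# so re.findall returns one tuple of group values per match.
with_groups = re.findall(r"[abcd][(]\d+[)]|(ab)[(]\d+[)]|(abcd)[(]\d+[)]", line)

# Same alternatives without capturing groups: re.findall returns whole matches.
without_groups = re.findall(r"[abcd]\(\d+\)|ab\(\d+\)|abcd\(\d+\)", line)

print(with_groups)     # [('', ''), ('ab', '')]
print(without_groups)  # ['a(123)', 'ab(35)']
```

The first tuple is ('', '') because a(123) is matched by the first alternative, which has no group, so both groups are reported as empty strings.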
If you want to just match a1236, ab12, abcd12342, then use
expressions = re.findall(r"[abcd]\d+|ab\d+|abcd\d+",line)
print(expressions)
However, if you want to capture certain parts with their repetitions, put parentheses around them.
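For example, capturing the letter prefix and the digits separately might look like the sketch below. It assumes the parentheses in the input are literal, as in the question, and the sample string is made up for illustration.

```python
import re

line = "a(123) ab(35) abcd(1234)"  # sample input, assumed for illustration

# Group 1 captures the prefix, group 2 captures the digits.
# The escaped \( and \) match the literal parentheses and are not captured.
pairs = re.findall(r"([abcd]|ab|abcd)\((\d+)\)", line)

for prefix, digits in pairs:
    print(prefix, int(digits))

print(pairs)  # [('a', '123'), ('ab', '35'), ('abcd', '1234')]
```

With two groups in the pattern, re.findall returns a list of (prefix, digits) tuples rather than the full matched text.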
Upvotes: 2
Reputation: 2596
[(] and [)] do match literal parentheses, but \( and \) are the more conventional way to write them.
Also, the \(\d+\) part is repeated in every alternative, so you can factor it out:
import re
line = 'a(123) b(11) ab(35) bc(45) abcd(1234)'
expressions = re.findall(
r'(?:[abcd]|ab|abcd)\(\d+\)',
line)
print(expressions)
output:
['a(123)', 'b(11)', 'ab(35)', 'c(45)', 'abcd(1234)']
explanation:
(?:...) is a non-capturing group. It is only for grouping, not capturing.
\( and \): \ is the escape character for special characters like ( or ). \( matches a literal (.
Upvotes: 3
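As a small sketch of the escaping point, a literal parenthesis can be written in a few equivalent ways in Python's re module. The sample string is an assumption for illustration.

```python
import re

text = "a(1)"  # sample input, assumed for illustration

# Backslash escape: \( and \) match literal parentheses.
escaped = re.findall(r"a\(\d+\)", text)

# Character class: inside [...], ( and ) are literal characters.
char_class = re.findall(r"a[(]\d+[)]", text)

# re.escape builds the escaped form from a plain string.
built = re.findall("a" + re.escape("(") + r"\d+" + re.escape(")"), text)

print(escaped, char_class, built)  # ['a(1)'] ['a(1)'] ['a(1)']
```

All three patterns match the same text, so the choice between them is mostly a matter of readability.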