Jadon Erwin

Reputation: 661

Trouble implementing Regex in Python

I'm having trouble implementing a regex pattern in Python. My expression works on https://regexr.com/ but I can't get it to work in Python.

Here is my expression: [abcd][(]\d+[)]|(ab)[(]\d+[)]|(abcd)[(]\d+[)]. I want to find and return instances of a(\d+), b(\d+), c(\d+), d(\d+), ab(\d+), or abcd(\d+).

expressions = re.findall(r"[abcd][(]\d+[)]|(ab)[(]\d+[)]|(abcd)[(]\d+[)]",line)
print(expressions)

I think it might be partly working, because when the string contains something that should match the pattern, I get [('', '')] as my output instead of [].

Any thoughts?

Upvotes: 1

Views: 56

Answers (2)

Gwang-Jin Kim

Reputation: 9930

expressions = re.findall(r"[abcd](\d+)|(ab)(\d+)|(abcd)(\d+)",line)
print(expressions)

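This matches input such as a123, with the digits captured as a group, because in a regex an unescaped ( and ) create a group rather than match a parenthesis. If you mean the parentheses in (\d+) literally, you have to match them explicitly. Your [(] and [)] character classes do that, but the escaped forms \( and \) are more common.

Be aware that if you put parentheses around ab or abcd, they become capturing groups, and re.findall then returns the groups instead of the whole match. I would not add parentheses unless they are necessary.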

expressions = re.findall(r"[abcd]\(\d+\)|ab\(\d+\)|abcd\(\d+\)",line)
print(expressions)
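
To make the effect of the groups concrete, here is a minimal sketch that reproduces the [('', '')] output from the question. The sample string and the variable names are my own assumptions, not taken from the question.

```python
import re

line = "a(123)"  # assumed sample input

# Pattern from the question: (ab) and (abcd) are capturing groups.
original = r"[abcd][(]\d+[)]|(ab)[(]\d+[)]|(abcd)[(]\d+[)]"

# With groups present, findall returns one tuple of groups per match.
# Here the first alternative matched, so both groups are empty.
with_groups = re.findall(original, line)
print(with_groups)  # [('', '')]

# Without groups, findall returns the whole match.
without_groups = re.findall(r"[abcd]\(\d+\)|ab\(\d+\)|abcd\(\d+\)", line)
print(without_groups)  # ['a(123)']

# Alternatively, keep the groups and ask each match for group(0).
whole_matches = [m.group(0) for m in re.finditer(original, line)]
print(whole_matches)  # ['a(123)']
```

So the original pattern was matching all along; findall was just reporting the (empty) groups rather than the matched text.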

If you want to just match a1236, ab12, abcd12342, then use

expressions = re.findall(r"[abcd]\d+|ab\d+|abcd\d+",line)
print(expressions)

However, if you want to capture certain parts of each match, put parentheses around those parts.
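
As a rough illustration of capturing on purpose, the sketch below pulls out the letter prefix and the number separately, using the a(123) form from the question. The sample string and the name pairs are assumptions for the example.

```python
import re

line = "a(123) ab(35) abcd(1234)"  # assumed sample input

# Two capturing groups: the letter prefix and the digits.
# The literal parentheses are escaped and sit outside the groups.
pairs = re.findall(r"([abcd]|ab|abcd)\((\d+)\)", line)
print(pairs)  # [('a', '123'), ('ab', '35'), ('abcd', '1234')]
```

With more than one group, findall returns a list of tuples, one element per group.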

Upvotes: 2

Boseong Choi

Reputation: 2596

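I think the trouble comes from the capturing groups rather than from [(] or [)], although \( and \) are the more usual way to write literal parentheses. The \(\d+\) part is also repeated in every alternative, so you can factor it out: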

import re

line = 'a(123) b(11) ab(35) bc(45) abcd(1234)'

expressions = re.findall(
    r'(?:[abcd]|ab|abcd)\(\d+\)',
    line)
print(expressions)

output:

['a(123)', 'b(11)', 'ab(35)', 'c(45)', 'abcd(1234)']

explanation:

  • (?:...) is a non-capturing group. It groups the alternatives without capturing them, so re.findall still returns the whole match.
  • \( and \): \ is the escape character for special characters like ( or ). \( matches a literal (.
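
A small sketch to check both points, reusing the sample line from above. The variable names are mine, and the third pattern is only there to show what changes if the group is made capturing.

```python
import re

line = "a(123) b(11) ab(35) bc(45) abcd(1234)"

# Escaped parentheses and single-character classes match the same text.
escaped = re.findall(r"(?:[abcd]|ab|abcd)\(\d+\)", line)
char_class = re.findall(r"(?:[abcd]|ab|abcd)[(]\d+[)]", line)
print(escaped == char_class)  # True

# Replacing (?:...) with a plain (...) makes findall return
# only the captured prefix instead of the whole match.
prefixes_only = re.findall(r"([abcd]|ab|abcd)\(\d+\)", line)
print(prefixes_only)  # ['a', 'b', 'ab', 'c', 'abcd']
```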

Upvotes: 3
