f.rodrigues

Reputation: 3587

Regex Avoid Unnecessary Groups

I have this code:

import re

string = """a = 10 + 15
b = 50 + b
c = a + b
d = c + 50"""


letter = r"([a-z])"
signs = r"(\+|\-|\*|\/)"
regex = re.compile(r"{0} = (\d+) {1} (\d+)|"
                   r"{0} = (\d+) {1} {0}|"
                   r"{0} = {0} {1} {0}|"
                   r"{0} = {0} {1} (\d+)".format(letter, signs))

If I do re.search(regex,string).groups() I end up with

('a', '10', '+', '15', None, None, None, None, None, None, None, None, None, None, None, None)
(None, None, None, None, 'b', '50', '+', 'b', None, None, None, None, None, None, None, None)
(None, None, None, None, None, None, None, None, 'c', 'a', '+', 'b', None, None, None, None)
(None, None, None, None, None, None, None, None, None, None, None, None, 'd', 'c', '+', '50')

But I want just 4 groups. [var,val1,operator,val2]

I'm using a list comprehension

[r for r in re.search(regex, string).groups() if r is not None]

But I wonder if there's a way to do this in the regex itself.

Upvotes: 1

Views: 64

Answers (2)

RevanProdigalKnight

Reputation: 1326

In this case it's best to collapse the four alternatives into a single pattern whose operand groups accept either a number or a letter. This does require removing the capturing parentheses from letter:

letter = r"[a-z]"
signs = r"(\+|\-|\*|\/)"
regex = re.compile(r"({0}) = (\d+|{0}) {1} (\d+|{0})".format(letter, signs))
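As a quick check, here is a minimal sketch that runs this pattern over the sample input from the question. It assumes re.finditer is used to walk all four lines (the question only shows re.search), and the names text and rows are just illustrative:

```python
import re

# Sample input from the question.
text = """a = 10 + 15
b = 50 + b
c = a + b
d = c + 50"""

letter = r"[a-z]"
signs = r"(\+|\-|\*|\/)"
# One pattern: each operand group accepts either a number or a letter.
regex = re.compile(r"({0}) = (\d+|{0}) {1} (\d+|{0})".format(letter, signs))

# finditer yields one match per assignment; each match has exactly 4 groups.
rows = [m.groups() for m in regex.finditer(text)]
for row in rows:
    print(row)
# ('a', '10', '+', '15')
# ('b', '50', '+', 'b')
# ('c', 'a', '+', 'b')
# ('d', 'c', '+', '50')
```

Because there is only one alternative, no group is ever None, so the filtering list comprehension is no longer needed.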

Upvotes: 2

Joey

Reputation: 354356

You can use (?:...) as a non-capturing group for things that need grouping, but not capturing. E.g.:

signs = r"(?:\+|\-|\*|\/)"

However, you can get rid of plenty of them by just not making signs and letter groups in the first place:

letter = "[a-z]"
signs = "[+*/-]"
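To illustrate, here is a rough sketch that combines both ideas: the building blocks are plain character classes, (?:...) is used where alternation needs grouping, and capturing parentheses are added only around the four parts that should be returned. The operand helper and the single-pattern layout are assumptions made for this example, not something from the question:

```python
import re

letter = r"[a-z]"
signs = r"[+*/-]"  # '-' placed last so it is a literal, not a range
# Hypothetical helper: groups the alternation without capturing it.
operand = r"(?:\d+|{0})".format(letter)

# Capture only what is wanted: variable, first operand, operator, second operand.
regex = re.compile(r"({0}) = ({1}) ({2}) ({1})".format(letter, operand, signs))

result = regex.search("c = a + b").groups()
print(result)        # ('c', 'a', '+', 'b')
print(regex.groups)  # 4
```

The non-capturing group inside operand does not count toward regex.groups, which is why the total stays at four.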

Upvotes: 0
