Ignacio Verona

Reputation: 655

Python regexp multiple expressions with grouping

I'm trying to match the output a modem returns when queried for network info. It looks like this:

Network survey started...

For BCCH-Carrier:
arfcn: 15,bsic: 4,dBm: -68

For non BCCH-Carrier:
arfcn: 10,dBm: -72
arfcn: 6,dBm: -78
arfcn: 11,dBm: -81
arfcn: 14,dBm: -83
arfcn: 16,dBm: -83

So I have two types of line to match, BCCH and non-BCCH. The following code almost works:

match = re.findall(r'(?:arfcn: (\d*),dBm: (-\d*))|(?:arfcn: (\d*),bsic: (\d*),dBm: (-\d*))', data)

But it seems that both alternatives contribute fields to every result, with the fields that were not found left blank:

>>> match
[('', '', '15', '4', '-68'), ('10', '-72', '', '', ''), ('6', '-78', '', '', ''), ('11', '-81', '', '', ''), ('14', '-83', '', '', ''), ('16', '-83', '', '', '')]

Can anyone help? Why does this happen? I've tried changing the order of the alternatives, with no luck.

Thanks!

Upvotes: 0

Views: 81

Answers (2)

Kaivosukeltaja

Reputation: 15735

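That is how capturing groups work: re.findall returns one tuple per match, with an entry for every capturing group in the pattern. Since you have five of them, there will always be five parts returned, and any group that did not take part in the match comes back as an empty string.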

Based on your data, I think you could simplify your regex by making the bsic part optional. That way each row would return three parts, the middle one being empty for non BCCH-Carriers.

match = re.findall(r'arfcn: (\d*)(?:,bsic: (\d*))?,dBm: (-\d*)', data)
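
As a quick check, here is a minimal sketch that runs this pattern against the sample from the question. It assumes `data` holds the modem output exactly as posted; the variable names `pattern` and `rows` are only illustrative.

```python
import re

# Sample modem output copied from the question (assumed to be the whole string).
data = """Network survey started...

For BCCH-Carrier:
arfcn: 15,bsic: 4,dBm: -68

For non BCCH-Carrier:
arfcn: 10,dBm: -72
arfcn: 6,dBm: -78
arfcn: 11,dBm: -81
arfcn: 14,dBm: -83
arfcn: 16,dBm: -83
"""

# Three capturing groups; the bsic group sits inside an optional non-capturing group.
pattern = r'arfcn: (\d*)(?:,bsic: (\d*))?,dBm: (-\d*)'
rows = re.findall(pattern, data)

for arfcn, bsic, dbm in rows:
    # findall reports a group that did not participate as '', so map it to None here.
    print(arfcn, bsic or None, dbm)
```

Every tuple now has three entries, and the middle one is an empty string on the non-BCCH lines, e.g. ('10', '', '-72').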

Upvotes: 1

Martijn Pieters

Reputation: 1121634

You have an expression with 5 groups.

The fact that 2 of them sit in one branch of the alternation and the other 3 in the other, mutually exclusive branch doesn't change that. Either 2 or 3 of the groups are going to be empty, depending on which kind of line you matched.

If you have to match either line with one expression, there is no way around this. You can use named groups (and return a dictionary of matched groups) to make this a little easier to manage, but you will always end up with empty groups.
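
A rough sketch of the named-group idea, under a few assumptions: the group names `arfcn`, `bsic` and `dbm` are my own choice, and the pattern uses a single optional bsic part rather than two alternatives, because Python's re module does not allow the same group name to be defined twice in one pattern.

```python
import re

# Sample modem output copied from the question (assumed to be the whole string).
data = """Network survey started...

For BCCH-Carrier:
arfcn: 15,bsic: 4,dBm: -68

For non BCCH-Carrier:
arfcn: 10,dBm: -72
arfcn: 6,dBm: -78
arfcn: 11,dBm: -81
arfcn: 14,dBm: -83
arfcn: 16,dBm: -83
"""

# Hypothetical group names; the bsic group is optional and may not participate.
pattern = re.compile(
    r'arfcn: (?P<arfcn>\d+)(?:,bsic: (?P<bsic>\d+))?,dBm: (?P<dbm>-\d+)'
)

# groupdict() returns one dict per match, keyed by group name.
cells = [m.groupdict() for m in pattern.finditer(data)]

for cell in cells:
    print(cell)
```

Unlike re.findall, which fills groups that did not take part in the match with empty strings, groupdict() reports them as None by default, so the non-BCCH rows come out with 'bsic': None.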

Upvotes: 1
