NoobTom

Reputation: 555

How to parse this string with one regular expression in Python

I need to parse this string with only one regular expression in Python. For every group, I need to save the value in a specific field. The problem is that one or more of the parameters may be missing or appear in a different order (e.g. domain 66666 ip nonce, with the middle part missing).

3249dsf 2013-02-10T06:44:30.666821+00:00 domain constant 66666 sync:[127.0.0.1] Request: pubvalue=kjiduensofksidoposiw&change=09872534&value2=jdmcnhj&counter=232&value3=2&nonce=7896089hujoiuhiuh098h

I need to assign:

EDIT

This is an example of how the string can vary: 123dsf 2014-01-11T06:49:30.666821+00:00 google constant 12356 sync:[192.168.0.1] Request: pubvalue=fggggggeesidoposiw&nonce=7896089hujoiuhiuh098h

Thank you in advance for showing me the way.

Upvotes: 1

Views: 150

Answers (1)

tuxtimo

Reputation: 2790

It's probably not a good idea to use one regex to parse the whole string, but if you have to, I think the solution is to use named groups (see "Named groups" on Regex Tutorial). A named group is captured with (?P<nameofgroup>bla).

So you can match, for example, the IP with:

import re
# Avoid naming the variable "str", which would shadow the built-in type.
line = "3249dsf 2013-02-10T06:44:30.666821+00:00 domain constant 66666 sync:[127.0.0.1] Request: pubvalue=kjiduensofksidoposiw&change=09872534&value2=jdmcnhj&counter=232&value3=2&nonce=7896089hujoiuhiuh098h"
# Raw string, so the backslashes reach the regex engine unchanged.
print(re.search(r"\[(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]", line).groupdict())

Just extend this regular expression with the patterns you need to match the other fields.
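
As a rough sketch of what such an extension could look like, here is the same idea applied to the fields in front of the query string. The group names (id, timestamp, domain, number) are my own placeholders, since the question does not list the target fields, and I am assuming the word "constant" really is fixed, as it is in both sample lines.

```python
import re

line = ("3249dsf 2013-02-10T06:44:30.666821+00:00 domain constant 66666 "
        "sync:[127.0.0.1] Request: pubvalue=kjiduensofksidoposiw&change=09872534"
        "&value2=jdmcnhj&counter=232&value3=2&nonce=7896089hujoiuhiuh098h")

# Group names are illustrative assumptions, not taken from the question.
# re.VERBOSE lets the pattern span several lines and carry comments.
HEADER = re.compile(r"""
    (?P<id>\S+)\s+                # leading token
    (?P<timestamp>\S+)\s+         # ISO 8601 timestamp
    (?P<domain>\S+)\s+            # domain name
    constant\s+                   # literal word, assumed to be fixed
    (?P<number>\d+)\s+            # numeric field
    sync:\[(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\]
""", re.VERBOSE)

fields = HEADER.match(line).groupdict()
print(fields)
```

With re.VERBOSE the pattern stays readable as it grows, which helps a little with the maintenance problem.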

You can make a group optional by placing a ? after the group's closing parenthesis, like: (?P<ip>pattern)?. If an optional group does not take part in the match, its entry in the dict will be None.
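
A small sketch of that behaviour, on shortened strings I made up for the demo. Note one deviation: I wrap the named group in a non-capturing (?: ... )? so that the literal text around it (sync:[ ... ]) becomes optional together with the group.

```python
import re

# The whole "sync:[...] " part is optional; "number" is a hypothetical name.
PATTERN = re.compile(r"(?P<number>\d+) (?:sync:\[(?P<ip>[\d.]+)\] )?Request:")

# Shortened, made-up inputs: one with the sync part, one without it.
with_ip = PATTERN.search("constant 66666 sync:[127.0.0.1] Request: nonce=abc").groupdict()
without_ip = PATTERN.search("constant 66666 Request: nonce=abc").groupdict()

print(with_ip)     # {'number': '66666', 'ip': '127.0.0.1'}
print(without_ip)  # {'number': '66666', 'ip': None}
```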

But note: it is not a good idea to do this with only one regex. It can be slow (many optional groups give the engine a lot of room to backtrack), and the regex will be long and hard to maintain!
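
If the single-regex requirement can be relaxed, one possible alternative (not what the question asks for, just a sketch) is to split the line at "Request: " and let urllib.parse.parse_qs from the standard library handle the key=value part. That copes with missing or reordered parameters without any optional groups. I am assuming here that " Request: " always separates the header from the parameters, as it does in both sample lines.

```python
import re
from urllib.parse import parse_qs

# The shorter sample line from the question's EDIT.
line = ("123dsf 2014-01-11T06:49:30.666821+00:00 google constant 12356 "
        "sync:[192.168.0.1] Request: pubvalue=fggggggeesidoposiw"
        "&nonce=7896089hujoiuhiuh098h")

# Step 1: separate the header from the query string.
header, _, query = line.partition(" Request: ")

# Step 2: a small regex for the IP only.
ip_match = re.search(r"\[(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\]", header)
ip = ip_match.group("ip") if ip_match else None

# Step 3: parse the parameters; order does not matter and absent keys
# are simply not in the dict. parse_qs returns lists, so keep the first value.
params = {key: values[0] for key, values in parse_qs(query).items()}

print(ip)
print(params.get("nonce"))
print(params.get("change"))  # this parameter is missing in the sample line
```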

Upvotes: 1
