oorahduc

Reputation: 185

Python Regex match string between specific string and end character

I am building a file stripper to generate a config report, and my base data is one very long string. The snippet below is a small part of it, but it illustrates what I'm working with.

Snippet Example: DEFAULT_GATEWAY=192.168.88.1&DELVRY_AGGREGATION_INTERVAL0=1&DELVRY_AGGREGATION_INTERVAL1=1&DELVRY_SCHEDULE0=1&DELVRY_SNI0=192.168.88.158&DELVRY_USE_SSL_TLS1=0&

How would I go about matching the following:

between "DEFAULT_GATEWAY=" and "&"
between "DELVRY_AGGREGATION_INTERVAL0=" and "&"
between "DELVRY_AGGREGATION_INTERVAL1=" and "&"
between "DELVRY_SCHEDULE0=" and "&"
between "DELVRY_SNI0=" and "&"
between "DELVRY_USE_SSL_TLS1=" and "&"

and building a dict with it like:

{"DEFAULT_GATEWAY":"192.168.88.1",
 "DELVRY_AGGREGATION_INTERVAL0":"1",
 "DELVRY_AGGREGATION_INTERVAL1":"1",
 "DELVRY_SCHEDULE0":"1",
 "DELVRY_SNI0":"192.168.88.158",
 "DELVRY_USE_SSL_TLS1":"0"}

?

Upvotes: 1

Views: 69

Answers (3)

Hai Vu

Reputation: 40688

Here is a way to do it.

In [1]: input = 'DEFAULT_GATEWAY=192.168.88.1&DELVRY_AGGREGATION_INTERVAL0=1&DELVRY_AGGREGATION_INTERVAL1=1&DELVRY_SCHEDULE0=1&DELVRY_SNI0=192.168.88.158&DELVRY_USE_SSL_TLS1=0&'

In [2]: input.split('&')
Out[2]: 
['DEFAULT_GATEWAY=192.168.88.1',
 'DELVRY_AGGREGATION_INTERVAL0=1',
 'DELVRY_AGGREGATION_INTERVAL1=1',
 'DELVRY_SCHEDULE0=1',
 'DELVRY_SNI0=192.168.88.158',
 'DELVRY_USE_SSL_TLS1=0',
 '']

In [3]: [keyval.split('=') for keyval in input.split('&') if keyval]
Out[3]: 
[['DEFAULT_GATEWAY', '192.168.88.1'],
 ['DELVRY_AGGREGATION_INTERVAL0', '1'],
 ['DELVRY_AGGREGATION_INTERVAL1', '1'],
 ['DELVRY_SCHEDULE0', '1'],
 ['DELVRY_SNI0', '192.168.88.158'],
 ['DELVRY_USE_SSL_TLS1', '0']]

In [4]: dict(keyval.split('=') for keyval in input.split('&') if keyval)
Out[4]: 
{'DEFAULT_GATEWAY': '192.168.88.1',
 'DELVRY_AGGREGATION_INTERVAL0': '1',
 'DELVRY_AGGREGATION_INTERVAL1': '1',
 'DELVRY_SCHEDULE0': '1',
 'DELVRY_SNI0': '192.168.88.158',
 'DELVRY_USE_SSL_TLS1': '0'}

Notes

  1. This is the input line
  2. Split on & to get the key=value entries. The last entry is empty because the string ends with &
  3. Split each non-empty entry on the equal sign, giving [key, value] pairs
  4. Pass those pairs to dict() to build the dictionary
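The steps in the notes can also be wrapped in a small function. This is only a sketch: the name parse_pairs and the shortened snippet are made up for illustration, and it assumes every non-empty entry contains at least one '=' (an entry without one would raise ValueError).

```python
def parse_pairs(text):
    """Turn 'K1=V1&K2=V2&' into {'K1': 'V1', 'K2': 'V2'}."""
    result = {}
    for entry in text.split('&'):         # step 2: one entry per key=value pair
        if not entry:                     # step 3: skip the empty trailing entry
            continue
        key, value = entry.split('=', 1)  # split on the first '=' only
        result[key] = value               # step 4: build the dictionary
    return result


# Shortened version of the snippet from the question.
snippet = ('DEFAULT_GATEWAY=192.168.88.1&DELVRY_AGGREGATION_INTERVAL0=1&'
           'DELVRY_SNI0=192.168.88.158&DELVRY_USE_SSL_TLS1=0&')
config = parse_pairs(snippet)
print(config['DELVRY_SNI0'])
```

Splitting with a maxsplit of 1 keeps any further '=' characters inside the value instead of breaking the unpacking.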

Another Solution

In [8]: from urllib import parse as urlparse  # Python 3; on Python 2 use: import urlparse

In [9]: urlparse.parse_qsl(input)
Out[9]: 
[('DEFAULT_GATEWAY', '192.168.88.1'),
 ('DELVRY_AGGREGATION_INTERVAL0', '1'),
 ('DELVRY_AGGREGATION_INTERVAL1', '1'),
 ('DELVRY_SCHEDULE0', '1'),
 ('DELVRY_SNI0', '192.168.88.158'),
 ('DELVRY_USE_SSL_TLS1', '0')]

In [10]: dict(urlparse.parse_qsl(input))
Out[10]: 
{'DEFAULT_GATEWAY': '192.168.88.1',
 'DELVRY_AGGREGATION_INTERVAL0': '1',
 'DELVRY_AGGREGATION_INTERVAL1': '1',
 'DELVRY_SCHEDULE0': '1',
 'DELVRY_SNI0': '192.168.88.158',
 'DELVRY_USE_SSL_TLS1': '0'}
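As a standalone Python 3 script, the same call would import parse_qsl from urllib.parse. The sketch below also shows two behaviours worth knowing before relying on it for config data: pairs with a blank value are dropped unless keep_blank_values=True is passed, and values are URL-decoded (percent escapes are resolved and '+' becomes a space). The EMPTY_VALUE key is hypothetical and only there to show the difference.

```python
from urllib.parse import parse_qsl

# EMPTY_VALUE is a made-up key with a blank value, for illustration only.
data = 'DEFAULT_GATEWAY=192.168.88.1&DELVRY_SCHEDULE0=1&EMPTY_VALUE=&'

# Default behaviour: pairs whose value is blank are left out.
without_blanks = dict(parse_qsl(data))

# keep_blank_values=True keeps them, with '' as the value.
with_blanks = dict(parse_qsl(data, keep_blank_values=True))

print(without_blanks)
print(with_blanks)
```

If the raw data can contain literal '+' or '%' characters that must be preserved, the plain split approach is the safer choice.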

Upvotes: 3

Alex Martelli

Reputation: 881555

import re

keys = {"DEFAULT_GATEWAY",
    "DELVRY_AGGREGATION_INTERVAL0",
    "DELVRY_AGGREGATION_INTERVAL1",
    "DELVRY_SCHEDULE0",
    "DELVRY_SNI0",
    "DELVRY_USE_SSL_TLS1"}
resdict = {}
for k in keys:
    pat = '{}=([^&]*)&'.format(re.escape(k))  # key, '=', then everything up to the next '&'
    mo = re.search(pat, bigstring)
    if mo is None: continue  # no match
    resdict[k] = mo.group(1)

will leave your desired result in resdict, if bigstring is the string you're searching in.

This assumes you know in advance which keys you'll be looking for, and you keep them in a set keys. If you don't know in advance the keys of interest, that's a very different issue of course.
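For the case where the keys are not known in advance, one possible regex sketch is to capture every key=value pair in a single pass with re.findall. The pattern below is an assumption about the data format (keys contain neither '&' nor '=', values run up to the next '&'), and bigstring here is a shortened version of the snippet from the question.

```python
import re

bigstring = ('DEFAULT_GATEWAY=192.168.88.1&DELVRY_AGGREGATION_INTERVAL0=1&'
             'DELVRY_SNI0=192.168.88.158&DELVRY_USE_SSL_TLS1=0&')

# Key: one or more characters other than '&' and '='.
# Value: zero or more characters up to the next '&' (or the end of the string).
pair_pattern = re.compile(r'([^&=]+)=([^&]*)')

# With two groups, findall returns a list of (key, value) tuples,
# which dict() accepts directly.
resdict = dict(pair_pattern.findall(bigstring))
print(resdict)
```

If a key appears more than once, the last occurrence wins, as with any dict built from a sequence of pairs.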

Upvotes: 0

Jared

Reputation: 521

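Split first on '&' to get a list of strings, then split each non-empty string on '=' (the trailing '&' in your data leaves an empty string at the end of the list), like this:

d = dict(kv.split('=', 1) for kv in line.split('&') if kv)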
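One-liners of this shape are sensitive to segments that are empty or have no '=' at all. A more tolerant sketch uses str.partition, which always returns three parts, so malformed segments can be skipped instead of raising. The function name to_dict and the NOTE and BROKEN entries are made up for illustration.

```python
def to_dict(line):
    d = {}
    for kv in line.split('&'):
        # partition splits on the first '=' and reports whether it was found.
        key, sep, value = kv.partition('=')
        if not sep:  # empty segment, or a segment with no '=': skip it
            continue
        d[key] = value
    return d


# NOTE and BROKEN are hypothetical entries, not part of the original data.
line = 'DELVRY_SNI0=192.168.88.158&NOTE=a=b&BROKEN&'
result = to_dict(line)
print(result)
```

Whether silently skipping malformed segments is acceptable depends on the report; raising an error may be preferable if the input is expected to be clean.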

Upvotes: 0
