IUnknown

Reputation: 9819

Splitting a brace-grouped string in Python

I would appreciate help with a one-liner idiom that does the following efficiently.

I have a string made of groups delimited by braces, as below:

{1:xxxx}{2:xxxx}{3:{10:xxxx}}{4:xxxx\r\n:xxxx}....  

How do I convert this into a dictionary format?

dict={1:'xxxx',2:'xxxx',3:'{10:xxxx}',4:'xxxx\r\n:xxxx'}

Upvotes: 4

Views: 1372

Answers (2)

georg

Reputation: 214959

import re

r = r"""(?x)
{
    (\w+)
    :
    (
        (?:
            [^{}]
            |
            {.+?}
        )+
    )
}
"""

z = "{1:xxxx}{2:xxxx}{3:{10:xxxx}}{4:'xxxx'}"
print(dict(re.findall(r, z)))

# {'1': 'xxxx', '2': 'xxxx', '3': '{10:xxxx}', '4': "'xxxx'"}

Feel free to convert this to a one-liner if you want: just remove (?x) and all whitespace from the regex.
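For reference, here is a sketch of what the collapsed one-liner might look like. The int() conversion is an addition that is not part of the pattern above; it assumes every key is numeric, as in the question's sample. Without it the keys stay strings.

```python
import re

z = "{1:xxxx}{2:xxxx}{3:{10:xxxx}}{4:'xxxx'}"

# Same pattern as above with (?x) and all whitespace removed.
# int(k) is an assumption: it only works when every key is numeric.
result = {int(k): v for k, v in re.findall(r"{(\w+):((?:[^{}]|{.+?})+)}", z)}
print(result)
```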

The above handles only one level of nesting. To handle arbitrary depths, consider the more advanced third-party regex module, which supports recursive patterns:

import regex

r = r"""(?x)
{
    (\w+)
    :
    (
        (?:
            [^{}]
            |
            (?R)
        )+
    )
}
"""

z = "{1:abc}{2:{3:{4:foo}}}{5:bar}"
print(dict(regex.findall(r, z)))

# {'1': 'abc', '2': '{3:{4:foo}}', '5': 'bar'}
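
If installing a third-party module is not an option, the same arbitrary-depth split can probably be done with a plain depth counter from the standard library alone. This is a different technique from the recursive regex above, not a translation of it. split_groups is a hypothetical helper name, and the sketch assumes the braces in the input are balanced.

```python
def split_groups(text):
    """Yield (key, value) for each top-level {key:value} group in text."""
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                start = i + 1  # first character after the opening brace
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # text[start:i] is the body of one top-level group;
                # split it at the first ':' only, so nested colons survive
                key, _, value = text[start:i].partition(":")
                yield key, value

z = "{1:abc}{2:{3:{4:foo}}}{5:bar}"
result = dict(split_groups(z))
print(result)
```

As with the regex versions, nested groups are kept as raw strings and the keys are strings.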

Upvotes: 4

Jochen Ritzel

Reputation: 107638

This is how I would do it:

raw = """{1:xxxx}{2:xxxx}{3:{10:xxxx}}{4:'xxxx\r\n:xxxx'}"""

def parse(raw):
    # split into chunks by '}{' and remove the outer '{}'
    parts = raw[1:-1].split('}{')
    for part in parts:
        # split by the first ':'
        num, data = part.split(':', 1)
        # yield each entry found
        yield int(num), data

# make a dict from it
print(dict(parse(raw)))

It keeps the '{10:xxxx}' as a string just like in your example.
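
One caveat worth checking against your data: because the split is on the literal '}{', this approach assumes that sequence only ever appears between top-level groups. A small sketch with made-up inputs (not from the question) shows the difference:

```python
def parse(raw):
    # Same approach as above: strip the outer braces, split on '}{'.
    for part in raw[1:-1].split('}{'):
        num, data = part.split(':', 1)
        yield int(num), data

# Fine: the only '}{' sequences are between top-level groups.
ok = dict(parse("{1:xxxx}{3:{10:xxxx}}{4:yyyy}"))
print(ok)

# Not fine: the value of group 1 holds two sibling groups, so it
# contains '}{' itself and gets cut in the middle.
broken = dict(parse("{1:{10:a}{11:b}}{2:c}"))
print(broken)
```

If your values can contain sibling groups like the second input, a regex or a depth counter is the safer choice.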

Upvotes: 0
