Miroslav

Reputation: 33

Parsing key values in string

I have a string that I am getting from a command line application. It has the following structure:

-- section1 --
item11|value11
item12|value12
item13

-- section2 --
item21|value21
item22

What I would like is to parse this into a dict so that I can easily access the values with:

d['section1']['item11']

I have already solved it for the case where there are no sections and every key has a value, but I get errors otherwise. I have tried a couple of things, but it is getting complicated and nothing seems to work. This is what I have now:

s="""
item11|value11
item12|value12
item21|value21
"""
d = {}
for l in s.split('\n'):
    print(l, l.split('|'))
    if l != '':
        d[l.split('|')[0]] = l.split('|')[1]

Can somebody help me extend this to handle sections and keys that have no value?

Upvotes: 3

Views: 392

Answers (2)

Kroltan

Reputation: 5156

A regular expression is a good way to detect the section headers:

import re


def parse(data):
    lines = data.split("\n")  # split input into lines
    result = {}
    current_header = ""

    for line in lines:
        if line:  # skip empty lines
            # try to match a section header such as "-- section1 --":
            match = re.match(r"^-- (.*) --$", line)
            if match:  # true when the above pattern matches
                # grab the part inside the parentheses:
                current_header = match.group(1)
            else:
                # key = text before the first "|", value = text after it;
                # sep is empty when the line has no "|" (no value given):
                key, sep, value = line.partition("|")
                # get the current section, creating it if needed:
                section = result.setdefault(current_header, {})
                section[key] = value if sep else None  # None = no value
    return result  # done.

print(parse("""
-- section1 --
item1|value1
item2|value2
-- section2 --
item1|valueA
item2|valueB"""))
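
Applied to the exact input from the question, including the items that have no value, the same regex-based idea can be sketched as below. This is only a minimal sketch: the name parse_sections, the use of None for a missing value, and filing items that appear before any header under the "" key are all assumptions made for illustration, not part of the question.

```python
import re


def parse_sections(text):
    """Parse '-- name --' headers and 'key|value' lines into nested dicts."""
    result = {}
    section = None  # dict of the section currently being filled
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # a header is a whole line of the form "-- name --"
        header = re.fullmatch(r"--\s*(.*?)\s*--", line)
        if header:
            section = result.setdefault(header.group(1), {})
            continue
        if section is None:
            # assumption: items before any header go under the "" key
            section = result.setdefault("", {})
        # partition never raises; sep is "" when the line has no "|"
        key, sep, value = line.partition("|")
        section[key] = value if sep else None  # None marks "no value"
    return result


sample = """
-- section1 --
item11|value11
item12|value12
item13

-- section2 --
item21|value21
item22
"""

d = parse_sections(sample)
print(d["section1"]["item11"])
print(d["section1"]["item13"])
```

Using partition instead of split means a line without a "|" does not raise an unpacking error, which is the failure the question describes.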

Upvotes: 1

elyase

Reputation: 40973

This seems like a perfect fit for the ConfigParser class from the configparser module in the standard library (Python 3):

import re
from configparser import ConfigParser

d = ConfigParser(delimiters='|', allow_no_value=True)
d.SECTCRE = re.compile(r"-- *(?P<header>[^]]+?) *--")  # sections regex
d.read_string(s)  # s holds the sectioned text from the question

Now you have an object that you can access like a dictionary:

>>> d['section1']['item11']
'value11'
>>> print(d['section2']['item22'])   # no value case
None
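
For completeness, here is a self-contained sketch of the same approach run against the question's input, ending with a conversion to plain nested dicts. The variable names text, parser and plain are assumptions for illustration, and so is passing interpolation=None, which is added here only so that a literal "%" in a value would not be treated as interpolation syntax.

```python
import re
from configparser import ConfigParser

text = """
-- section1 --
item11|value11
item12|value12
item13

-- section2 --
item21|value21
item22
"""

parser = ConfigParser(
    delimiters=("|",),      # split key and value on "|" instead of "=" or ":"
    allow_no_value=True,    # keys without a value are stored as None
    interpolation=None,     # assumption: keep "%" in values literal
)
# override the header pattern so "-- name --" is read as a section header
parser.SECTCRE = re.compile(r"-- *(?P<header>[^]]+?) *--")
parser.read_string(text)

# copy each section into an ordinary dict
plain = {name: dict(parser[name]) for name in parser.sections()}
print(plain["section1"]["item11"])
print(plain["section2"]["item22"])
```

One thing to keep in mind is that ConfigParser lowercases option names by default, so a key such as Item11 would be stored as item11 unless optionxform is overridden.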

Upvotes: 5
