Transforming non-tabular/chunked data into a nested dictionary in Python

Question

I have chunked data that looks like this:

>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144
qux 0 2.34787e-05 0.0136463

>Head2
foo 0 0.00118929 0.00136993
bar 0.0610655 0.980495 0.997179
qux 0.060879 0.982591 0.974276

The fields in each chunk are whitespace-separated. What I want to do is transform them into a nested dictionary that looks like this:

{ 'Head1': {'foo': '0 1.10699e-05 2.73049e-05',
            'bar': '0.939121 0.0173732 0.0119144',
            'qux': '0 2.34787e-05 0.0136463'},
  'Head2': {'foo': '0 0.00118929 0.00136993',
            'bar': '0.0610655 0.980495 0.997179',
            'qux': '0.060879 0.982591 0.974276'}
}

What's the way to do it in Python? I'm not sure how to go from here:

def parse():
    caprout="tmp.txt"
    with open(caprout, 'r') as file:
        datalines = (ln.strip() for ln in file)
        for line in datalines:
            if line.startswith(">Head"):
                print(line)
            elif not line.strip():
                print(line)
            else:
                print(line)
    return

def main():
    parse()
    return

if __name__ == '__main__':
    main()

MultiVAC · Accepted Answer

This is the simplest solution I could come up with off the top of my head:

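mainDict = dict()
filename = "tmp.txt"  # the input file from the question
with open(filename, 'r') as file:
    for line in file:
        line = line.strip()
        if line == "":
            continue  # skip the blank lines between chunks
        if line.startswith(">"):
            # header line: drop the leading ">" and start a new inner dict
            lastBlock = line[1:]
            mainDict[lastBlock] = dict()
            continue
        # data line: the first token is the key, the rest is the value
        splitLine = line.partition(" ")
        mainDict[lastBlock][splitLine[0]] = splitLine[2]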
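
If you would rather have something you can test without touching a file, the same idea can be wrapped in a function that takes any iterable of lines. This is only a rough sketch: the names parse_chunks and sample are made up for illustration, and it assumes every header line starts with ">" and that data lines appearing before the first header can be ignored. It splits on the first run of whitespace rather than a single space, so tab-separated input should work as well.

```python
import io


def parse_chunks(lines):
    """Build {header: {key: rest_of_line}} from an iterable of text lines."""
    result = {}
    current = None  # inner dict for the chunk currently being read
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank separator between chunks
        if line.startswith(">"):
            # Assumption: a header is any line beginning with ">".
            current = result.setdefault(line[1:], {})
            continue
        if current is None:
            continue  # data before any header is ignored in this sketch
        parts = line.split(None, 1)  # split on the first run of whitespace
        current[parts[0]] = parts[1] if len(parts) > 1 else ""
    return result


# A shortened copy of the sample data from the question.
sample = io.StringIO(
    ">Head1\n"
    "foo 0 1.10699e-05 2.73049e-05\n"
    "bar 0.939121 0.0173732 0.0119144\n"
    "\n"
    ">Head2\n"
    "foo 0 0.00118929 0.00136993\n"
)
nested = parse_chunks(sample)
print(nested["Head1"]["foo"])  # 0 1.10699e-05 2.73049e-05
```

To read the real file, pass the open file object instead of the StringIO, for example parse_chunks(open("tmp.txt")) inside a with block.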
