Reputation: 64014
I have chunked data that looks like this:
>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144
qux 0 2.34787e-05 0.0136463
>Head2
foo 0 0.00118929 0.00136993
bar 0.0610655 0.980495 0.997179
qux 0.060879 0.982591 0.974276
The fields within each chunk are whitespace-separated. What I want to do is transform the chunks into a nested dictionary that looks like this:
{ 'Head1': {'foo': '0 1.10699e-05 2.73049e-05',
            'bar': '0.939121 0.0173732 0.0119144',
            'qux': '0 2.34787e-05 0.0136463'},
  'Head2': {'foo': '0 0.00118929 0.00136993',
            'bar': '0.0610655 0.980495 0.997179',
            'qux': '0.060879 0.982591 0.974276'}
}
What's the way to do it in Python? I'm not sure how to go from here:
def parse():
    caprout = "tmp.txt"
    with open(caprout, 'r') as file:
        datalines = (ln.strip() for ln in file)
        for line in datalines:
            if line.startswith(">Head"):
                print(line)
            elif not line.strip():
                print(line)
            else:
                print(line)
    return

def main():
    parse()
    return

if __name__ == '__main__':
    main()
Upvotes: 0
Views: 77
Reputation: 3880
File:
[sgeorge@sgeorge-ld1 tmp]$ cat tmp.txt
>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144
qux 0 2.34787e-05 0.0136463
>Head2
foo 0 0.00118929 0.00136993
bar 0.0610655 0.980495 0.997179
qux 0.060879 0.982591 0.974276
Script:
[sgeorge@sgeorge-ld1 tmp]$ cat a.py
import json

dict_ = {}

def parse():
    caprout = "tmp.txt"
    with open(caprout, 'r') as file:
        datalines = (ln.strip() for ln in file)
        for line in datalines:
            if line != '':
                if line.startswith(">Head"):
                    # Header line: start a new inner dict for this chunk
                    key = line.replace('>', '')
                    dict_[key] = {}
                else:
                    # Data line: first token is the nested key, the rest is the value
                    nested_key, value = line.split(' ', 1)
                    dict_[key][nested_key] = value
    print(json.dumps(dict_))

parse()
Execution:
[sgeorge@sgeorge-ld1 tmp]$ python a.py | python -m json.tool
{
    "Head1": {
        "bar": "0.939121 0.0173732 0.0119144",
        "foo": "0 1.10699e-05 2.73049e-05",
        "qux": "0 2.34787e-05 0.0136463"
    },
    "Head2": {
        "bar": "0.0610655 0.980495 0.997179",
        "foo": "0 0.00118929 0.00136993",
        "qux": "0.060879 0.982591 0.974276"
    }
}
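If you would rather get the dictionary back from a function than fill a module-level one, the same idea can be sketched as below. This is only a rough variant, not a definitive implementation: the name `parse_chunks` and the inline `sample` text are made up for illustration, and it assumes every header starts with `>` and that the key is separated from the values by a single space, as in the data above.

```python
def parse_chunks(lines):
    """Build {header: {name: rest_of_line}} from an iterable of text lines."""
    result = {}
    current = None  # inner dict of the most recent ">" header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: drop the ">" and start (or reuse) its inner dict
            current = result.setdefault(line[1:], {})
        elif current is not None:
            # First space-separated token is the key, the rest is the value
            name, _, rest = line.partition(" ")
            current[name] = rest.strip()
        # Data lines that appear before any header are ignored in this sketch
    return result


# Hypothetical shortened sample, taken from the data in the question
sample = """\
>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144
>Head2
qux 0.060879 0.982591 0.974276
"""
parsed = parse_chunks(sample.splitlines())
print(parsed["Head1"]["foo"])  # 0 1.10699e-05 2.73049e-05
```

Since the function accepts any iterable of lines, an open file object can be passed in directly, for example `with open("tmp.txt") as fh: parsed = parse_chunks(fh)`.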
Upvotes: 1
Reputation: 354
This is the simplest solution I could come up with off the top of my head:
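mainDict = dict()
filename = "tmp.txt"  # assumed here: the input file from the question
with open(filename, 'r') as file:
    for line in file:
        line = line.strip()
        if line == "":
            continue
        # Test the prefix: line.find("Head") returns -1 (truthy) when absent
        if line.startswith(">Head"):
            lastBlock = line[1:]  # drop the leading ">"
            mainDict[lastBlock] = dict()
            continue
        # Everything before the first space is the key, the rest is the value
        splitLine = line.partition(" ")
        mainDict[lastBlock][splitLine[0]] = splitLine[2]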
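A note on detecting the header lines: `str.find` returns an index rather than a boolean, and it returns -1 when the substring is missing, which still counts as true in an `if`. A prefix test with `str.startswith` avoids that. The small demo below uses two lines from the question (the variable names are just for illustration) and also shows what `str.partition` gives back:

```python
header = ">Head1"
data = "foo 0 1.10699e-05 2.73049e-05"

# str.find returns the index of the match, or -1 when there is none
print(header.find("Head"))      # 1
print(data.find("Head"))        # -1
print(bool(data.find("Head")))  # True, so "if line.find(...)" matches data lines too

# str.startswith returns a real boolean
is_header = [s.startswith(">") for s in (header, data)]
print(is_header)                # [True, False]

# str.partition splits on the first separator only and returns a 3-tuple
name, sep, rest = data.partition(" ")
print(name)                     # foo
print(rest)                     # 0 1.10699e-05 2.73049e-05
```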
Upvotes: 1