Reputation: 3
In Python, I need to flatten a large nested dictionary that starts like this:
{u'February 19, 2016': {'calls': [{'%change': u'0.00%',
                                   'ask': u'6.50',
                                   'bid': u'5.20',
                                   'change': u'0.00',
                                   'interest': u'10',
                                   'last': u'10.30',
                                   'name': u'LVLT160219C00044000',
                                   'strike': u'44.00',
                                   'volatility': u'62.31%',
                                   'volume': u'10'}]}}
into a DataFrame with columns similar to:
name strike date type last ask bid change volume volatility
Thanks
Upvotes: 0
Views: 316
Reputation: 645
I would loop through your structure and convert it into a format that pandas recognizes automatically, such as a list of dicts. You'll have to customize it for your exact needs, but this is a proof of concept based on your current data structure.
import pandas as pd

# Your data looks something like this:
mydict = {"date1": {"calls": [{"change": 1, "ask": 4, "bid": 5, "name": "x83"},
                              {"change": 3, "ask": 9, "bid": 2, "name": "y99"}]},
          "date2": {"calls": [{"change": 4, "ask": 3, "bid": 7, "name": "z32"}]}}

def convert(something):
    # Convert numeric strings to floats; leave anything else unchanged
    try:
        return float(something)
    except (ValueError, TypeError):
        return something

# Make an empty list to collect one dict per call
dataseq = []
# List the fields you want from each call
desired = ["ask", "change", "bid", "name"]
for thisdate in mydict:
    # Get the calls for each date
    for thiscall in mydict[thisdate]["calls"]:
        # Initialize a new dictionary for this call with the entry date
        d = {"date": thisdate}
        for field in desired:
            # Get the value and convert it to a float where possible
            d[field] = convert(thiscall[field])
        # Add it to your list
        dataseq.append(d)

# Make a DataFrame, one row per call
a = pd.DataFrame(dataseq)
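To also get the `type` column from the question, the same loop can walk one level deeper and record the key that each list sits under. This is only a sketch under assumptions: the function name `flatten_chain` is made up for illustration, and the `"puts"` key is a guess at what the rest of the data contains, since the question only shows `"calls"`.

```python
import pandas as pd

# Sample shaped like the question's data. The "puts" key is an assumption;
# the question only shows "calls".
chain = {
    "February 19, 2016": {
        "calls": [
            {"%change": "0.00%", "ask": "6.50", "bid": "5.20", "change": "0.00",
             "interest": "10", "last": "10.30", "name": "LVLT160219C00044000",
             "strike": "44.00", "volatility": "62.31%", "volume": "10"},
        ],
        "puts": [],
    }
}

# Column order requested in the question
columns = ["name", "strike", "date", "type", "last", "ask", "bid",
           "change", "volume", "volatility"]

def flatten_chain(chain, columns):
    """Return one row per contract, tagged with its date and option type."""
    rows = []
    for date, by_type in chain.items():
        for option_type, contracts in by_type.items():
            for contract in contracts:
                row = {"date": date, "type": option_type}
                row.update(contract)
                rows.append(row)
    # Passing columns= keeps only those keys, in that order
    return pd.DataFrame(rows, columns=columns)

df = flatten_chain(chain, columns)

# Convert the plain numeric columns; "volatility" keeps its "%" string form
numeric = ["strike", "last", "ask", "bid", "change", "volume"]
df[numeric] = df[numeric].apply(pd.to_numeric)
```

Percent strings such as `volatility` are left untouched here; stripping the `%` before converting would be a separate step.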
Upvotes: 1