Python, Nested Dictionary to Dataframe

Question

In Python, I need to flatten a large nested dictionary that starts like this:

{u'February 19, 2016': {'calls': [{'%change': u'0.00%',
'ask': u'6.50',
'bid': u'5.20',
'change': u'0.00',
'interest': u'10',
'last': u'10.30',
'name': u'LVLT160219C00044000',
'strike': u'44.00',
'volatility': u'62.31%',
'volume': u'10'}]}}

into a DataFrame with columns similar to:

name  strike  date  type   last  ask bid  change  volume  volatility

Thanks

22degrees · Accepted Answer

I would loop through your structure and convert it into a format that pandas recognizes automatically, such as a list of dicts (one dict per row). You'll have to customize it for your exact needs, but this is a proof of concept based on your current data structure.

import pandas as pd
#your data looks something like this:
mydict={"date1" : {"calls": [ {"change":1,"ask":4,"bid":5,"name":"x83"},
                              {"change":3,"ask":9,"bid":2,"name":"y99"} ] },
        "date2" : {"calls": [ {"change":4,"ask":3,"bid":7,"name":"z32"} ] } }

def convert(something):
    # Convert values to floats, unless they're really strings
    try:
        return float(something)
    except (TypeError, ValueError):
        # not numeric (e.g. a name, or None): keep the value as it is
        return something

# make an empty sequence
dataseq = []
# list the fields you want from each call
desired = ["ask","change","bid","name"]

for thisdate, entry in mydict.items():
    # get the calls for each date
    for thiscall in entry["calls"]:
        # initialize a new dictionary for this call with the entry date
        d = {"date": thisdate}
        for field in desired:
            # get the data and convert to float if it's a float
            d[field] = convert(thiscall[field])
        # add it to your sequence
        dataseq.append(d)
# make a dataframe
a = pd.DataFrame(dataseq)
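
The question also asks for a type column, and the real data stores percentages as strings like '62.31%', which the float conversion above would leave untouched. Here is a rough sketch of one way to cover both, using only the standard library for the flattening step. The names flatten_options and to_number are made up for illustration, and it assumes every key under a date (only 'calls' appears in the question) holds a list of contract dicts and can serve as the type. Stripping the percent sign is also an assumption about what you want.

```python
def to_number(value):
    """Return value as a float when it parses as one; otherwise return it unchanged.

    Assumption: percent strings such as '62.31%' should become 62.31.
    Drop the rstrip call to keep them as strings.
    """
    try:
        return float(str(value).rstrip("%"))
    except ValueError:
        return value


def flatten_options(nested):
    """Flatten {date: {type: [contract, ...]}} into a list of flat records."""
    records = []
    for date, by_type in nested.items():
        for option_type, contracts in by_type.items():
            for contract in contracts:
                # start each row with the two keys that come from the nesting
                record = {"date": date, "type": option_type}
                # then copy every field of the contract, converting numbers
                record.update({key: to_number(val) for key, val in contract.items()})
                records.append(record)
    return records


# Sample taken from the question
data = {
    "February 19, 2016": {
        "calls": [
            {"%change": "0.00%", "ask": "6.50", "bid": "5.20", "change": "0.00",
             "interest": "10", "last": "10.30", "name": "LVLT160219C00044000",
             "strike": "44.00", "volatility": "62.31%", "volume": "10"}
        ]
    }
}

records = flatten_options(data)
```

The resulting list can be passed to pd.DataFrame(records) exactly like dataseq above. Unlike the desired list, this copies every field of each contract, so select or reorder columns afterwards if you only want some of them.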
