Daventure
Daventure

Reputation: 23

Convert a dictionary of a list of dictionaries to pandas DataFrame

I pulled a list of historical option price of AAPL from the RobinHoood function robin_stocks.get_option_historicals(). The data was returned in a form of dictional of list of dictionary as shown below.

I am having difficulties to convert the below object (named historicalData) into a DataFrame. Can someone please help?

historicalData = {'data_points': [{'begins_at': '2020-10-05T13:30:00Z',
   'open_price': '1.430000',
   'close_price': '1.430000',
   'high_price': '1.430000',
   'low_price': '1.430000',
   'volume': 0,
   'session': 'reg',
   'interpolated': False},
{'begins_at': '2020-10-05T13:40:00Z',
   'open_price': '1.430000',
   'close_price': '1.340000',
   'high_price': '1.440000',
   'low_price': '1.320000',
   'volume': 0,
   'session': 'reg',
   'interpolated': False}],
'open_time': '0001-01-01T00:00:00Z',
'open_price': '0.000000',
'previous_close_time': '0001-01-01T00:00:00Z',
'previous_close_price': '0.000000',
'interval': '10minute',
'span': 'week',
'bounds': 'regular',
'id': '22b49380-8c50-4c76-8fb1-a4d06058f91e',
'instrument': 'https://api.robinhood.com/options/instruments/22b49380-8c50-4c76-8fb1-a4d06058f91e/'}

I tried the below code code but that didn't help:

import pandas as pd
df = pd.DataFrame(historicalData)
df

Upvotes: 1

Views: 143

Answers (2)

Valdi_Bo
Valdi_Bo

Reputation: 31011

You didn't write that you want only data_points (as in the other answer), so I assume that you want your whole dictionary converted to a DataFrame.

To do it, start with your code:

df = pd.DataFrame(historicalData)

It creates a DataFrame, with data_points "exploded" to consecutive rows, but they are still dictionaries.

Then rename open_price column to open_price_all:

df.rename(columns={'open_price': 'open_price_all'}, inplace=True)

The reason is to avoid duplicated column names after join to be performed soon (data_points contain also open_price attribute and I want the corresponding column from data_points to "inherit" this name).

The next step is to create a temporary DataFrame - a split of dictionaries in data_points to individual columns:

wrk = df.data_points.apply(pd.Series)

Print wrk to see the result.

And the last step is to join df with wrk and drop data_points column (not needed any more, since it was split into separate columns):

result = df.join(wrk).drop(columns=['data_points'])

Upvotes: 2

Issei Kumagai
Issei Kumagai

Reputation: 685

This is pretty easy to solve with the below. I have chucked the dataframe to a list via list comprehension

import pandas as pd
df_list = [pd.DataFrame(dic.items(), columns=['Parameters', 'Value']) for dic in historicalData['data_points']]

You then could do:

df_list[0]

which will yield

     Parameters                 Value
0     begins_at  2020-10-05T13:30:00Z
1    open_price              1.430000
2   close_price              1.430000
3    high_price              1.430000
4     low_price              1.430000
5        volume                     0
6       session                   reg
7  interpolated                 False

Upvotes: 1

Related Questions