I have the following list that I would like to convert into a pandas dataframe:
data = [
    {'ts_raw_c:TSLA': [
        {'ticker': 'TSLA', 'type': 'close'}, [
            (1546405200000, 62.024),
            (1546491600000, 60.072)
        ]
    ]},
    {'ts_raw_h:TSLA': [
        {'ticker': 'TSLA', 'type': 'high'}, [
            (1546405200000, 63.026),
            (1546491600000, 61.88)
        ]
    ]},
    {'ts_raw_l:TSLA': [
        {'ticker': 'TSLA', 'type': 'low'}, [
            (1546405200000, 59.76),
            (1546491600000, 59.476)
        ]
    ]},
    {'ts_raw_o:TSLA': [
        {'ticker': 'TSLA', 'type': 'open'}, [
            (1546405200000, 61.22),
            (1546491600000, 61.4)
        ]
    ]}
]
Desired dataframe output:
                close    high     low   open
1546405200000  62.024  63.026  59.76   61.22
1546491600000  60.072  61.88   59.476  61.4
I think the appropriate way to create the dataframe is like so:
df = pandas.DataFrame(df_column_values, index=df_index, columns=df_column_names)
To that end, the following code creates df_index and df_column_names properly, but I'm stuck on the code needed to walk each nested list of dictionaries and its list of tuples to piece together df_column_values.
Every attempt ends up with one inner list per column (each as long as the index) instead of one inner list per row (each as long as the number of columns).
# so.py
df_index = []
df_column_names = []
df_column_values = []
all_price_values_per_price_label = {}
all_price_labels = []
for line in data:
    for key_name in line.keys():
        price_label = line[key_name][0]['type']
        df_column_names.append(price_label)
        all_price_values_per_price_label[price_label] = []
        for items in line[key_name][1]:
            if items[0] not in df_index:  # timestamp
                df_index.append(items[0])
            all_price_values_per_price_label[price_label].append(items[1])
for price_label in all_price_values_per_price_label:
    all_price_labels.append(price_label)
for price_label in all_price_values_per_price_label:
    df_column_values.append(all_price_values_per_price_label[price_label])
print(df_index)
print(df_column_names)
print(df_column_values)
# python3 so.py
[1546405200000, 1546491600000]
['close', 'high', 'low', 'open']
[[62.024, 60.072], [63.026, 61.88], [59.76, 59.476], [61.22, 61.4]]
df_column_values would need to look like so to be valid:
df_column_values = [[62.024, 63.026, 59.76, 61.22], [60.072, 61.88, 59.476, 61.4]]
Reputation: 1826
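You don't need to transpose df_column_values. Pair each column name with its list of values using dict(zip(...)) and pass the resulting dictionary to DataFrame, which treats each key as a column:
from pandas import DataFrame
df = DataFrame(dict(zip(df_column_names, df_column_values)), index=df_index)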
print(df)
Output:
close high low open
1546405200000 62.024 63.026 59.760 61.22
1546491600000 60.072 61.880 59.476 61.40
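As an alternative, the intermediate lists can be skipped and the frame built straight from `data`. The following is only a minimal sketch, assuming every entry has the `[metadata, list_of_(timestamp, value)_tuples]` shape shown in the question; the names `columns`, `column_values` and `row_values` are made up for illustration. It also shows that the row-wise layout the question asks for is just the transpose of the per-column lists, which `zip(*...)` produces.

```python
import pandas as pd

# Same structure as `data` in the question, written compactly.
data = [
    {'ts_raw_c:TSLA': [{'ticker': 'TSLA', 'type': 'close'},
                       [(1546405200000, 62.024), (1546491600000, 60.072)]]},
    {'ts_raw_h:TSLA': [{'ticker': 'TSLA', 'type': 'high'},
                       [(1546405200000, 63.026), (1546491600000, 61.88)]]},
    {'ts_raw_l:TSLA': [{'ticker': 'TSLA', 'type': 'low'},
                       [(1546405200000, 59.76), (1546491600000, 59.476)]]},
    {'ts_raw_o:TSLA': [{'ticker': 'TSLA', 'type': 'open'},
                       [(1546405200000, 61.22), (1546491600000, 61.4)]]},
]

# Build one Series per price type, indexed by timestamp.
# Assumption: each dict value is a two-item list [metadata, points].
columns = {}
for entry in data:
    for meta, points in entry.values():
        columns[meta['type']] = pd.Series(dict(points))

# A dict of Series becomes one column per key, aligned on the index.
df = pd.DataFrame(columns)
print(df)

# The row-wise layout from the question is the transpose of the column lists.
column_values = [columns[name].tolist() for name in columns]
row_values = [list(row) for row in zip(*column_values)]
print(row_values)
```

Because the Series are aligned on their timestamps, a timestamp missing from one price type would show up as NaN in that column rather than shifting the other values out of place.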