Reputation: 5071
I can not find a way to save properly and retrieve properly a multi-indexed pandas dataframe, so that to retain the multi-indexed column structure. For a reproducible example:
toy_data.to_json()
'{"["GOOG","Shares"]":{"1521849600000":null,"1521936000000":null,"1522368000000":null,"1522454400000":694548763.0,"1522540800000":null},"["GOOG","ROE"]":{"1521849600000":null,"1521936000000":null,"1522368000000":null,"1522454400000":0.1076,"1522540800000":null},"["FB","Shares"]":{"1521849600000":null,"1521936000000":null,"1522368000000":null,"1522454400000":2398606201.0,"1522540800000":null},"["FB","ROE"]":{"1521849600000":null,"1521936000000":null,"1522368000000":null,"1522454400000":0.2465,"1522540800000":null}}'
toy_data.to_csv('toy_data.csv')
toy_data1 = pd.read_csv('toy_data.csv')
Upvotes: 0
Views: 70
Reputation: 294298
read_csv
Using the header
and index_col
arguments in read_csv
will get you what you need.
toy_data.to_csv('sample.csv')
pd.read_csv('sample.csv', header=[0, 1], index_col=[0])
Company GOOG FB
Indicators Shares ROE Shares ROE
Quarter_end
2018-03-24 NaN NaN NaN NaN
2018-03-25 NaN NaN NaN NaN
2018-03-30 NaN NaN NaN NaN
2018-03-31 1.0 2.0 3.0 4.0
2018-04-01 NaN NaN NaN NaN
read_hdf
Saving to hdf
might be a better option.
toy_data.to_hdf('sample.h5', 'toy_key')
pd.read_hdf('sample.h5', 'toy_key')
Company GOOG FB
Indicators Shares ROE Shares ROE
Quarter_end
2018-03-24 NaN NaN NaN NaN
2018-03-25 NaN NaN NaN NaN
2018-03-30 NaN NaN NaN NaN
2018-03-31 1.0 2.0 3.0 4.0
2018-04-01 NaN NaN NaN NaN
cols = pd.MultiIndex.from_product(
[['GOOG', 'FB'], ['Shares', 'ROE']],
names=['Company', 'Indicators']
)
idx = pd.to_datetime(
['2018-03-24', '2018-03-25', '2018-03-30',
'2018-03-31', '2018-04-01']
).rename('Quarter_end')
toy_data = pd.DataFrame([
[np.nan, np.nan, np.nan, np.nan],
[np.nan, np.nan, np.nan, np.nan],
[np.nan, np.nan, np.nan, np.nan],
[1, 2, 3, 4],
[np.nan, np.nan, np.nan, np.nan],
], idx, cols)
Upvotes: 2
Reputation: 5460
You haven't provided usable sample data, but I'm fairly certain that all you need to do is pass header=[0, 1]
, and index_col=0
as arguments to read_csv
.
Upvotes: 0