Reputation: 6267
According to the pandas documentation for read_excel, I can put the index column names on a separate line and then the method will which columns should be used as indices.
I want to create an Excel file from a multi-indexed dataframe that can be read in as such, but I can't figure out how to get pandas to write to_excel in such a way that this additional row is created (from a multi-indexed dataframe).
I can't imagine that storing a multi-indexed dataframe as an Excel worksheet and then pulling it back in later is that uncommon a use case, so I'm wondering if I just haven't figured out how to do this.
Here's an example of a dataframe I'd like to 'freeze' in Excel before reading back in without having to tell read_excel which columns are the indices:
ipdb> my_df
Date Amount
Rec Section Row
0 Top Section 2 2015-05-01 -105.00
1 Middle Section 3 2015-05-04 90247.60
2 Middle Section 4 2015-05-05 -2992.99
3 Bottom Section 5 2015-05-08 -800.00
In my example, there are three index columns: Rec, Section, and Row.
When I write this to Excel and then read it back in, I don't want to have to tell it this. Since read_excel seems to have a method that infers the index names when they appear on a separate row, I want to have it just figure it out (assuming I correctly write the Excel file).
What am I missing?
Upvotes: 3
Views: 1080
Reputation: 21
I was encountering the same issue when trying to write a pivot table to Excel. I was able to get this to work by modifying the frame.py
file in ../pandas/core
. Changing if self.columns.nlevels > 1
to if self.columns.nlevels > 1 and not index
got me what I needed.
As this functionality is still not supported by Pandas
, you may still encounter funny output. Also, this will likely not solve the issue for read_excel
either. Hopefully this helps a little!
I referenced 'onesandzeros' in his GitHub comment.
Upvotes: 2