HaPsantran
HaPsantran

Reputation: 6267

How can I preserve a pandas multi-index between a to_excel() and a read_excel()?

According to the pandas documentation for read_excel, I can put the index column names on a separate line and then the method will which columns should be used as indices.

I want to create an Excel file from a multi-indexed dataframe that can be read in as such, but I can't figure out how to get pandas to write to_excel in such a way that this additional row is created (from a multi-indexed dataframe).

I can't imagine that storing a multi-indexed dataframe as an Excel worksheet and then pulling it back in later is that uncommon a use case, so I'm wondering if I just haven't figured out how to do this.

Here's an example of a dataframe I'd like to 'freeze' in Excel before reading back in without having to tell read_excel which columns are the indices:

ipdb> my_df
                             Date         Amount
Rec Section             Row                         
0   Top Section         2    2015-05-01      -105.00
1   Middle Section      3    2015-05-04     90247.60
2   Middle Section      4    2015-05-05     -2992.99
3   Bottom Section      5    2015-05-08      -800.00

In my example, there are three index columns: Rec, Section, and Row.

When I write this to Excel and then read it back in, I don't want to have to tell it this. Since read_excel seems to have a method that infers the index names when they appear on a separate row, I want to have it just figure it out (assuming I correctly write the Excel file).

What am I missing?

Upvotes: 3

Views: 1080

Answers (1)

Tyler-B
Tyler-B

Reputation: 21

I was encountering the same issue when trying to write a pivot table to Excel. I was able to get this to work by modifying the frame.py file in ../pandas/core. Changing if self.columns.nlevels > 1 to if self.columns.nlevels > 1 and not index got me what I needed.

As this functionality is still not supported by Pandas, you may still encounter funny output. Also, this will likely not solve the issue for read_excel either. Hopefully this helps a little!

I referenced 'onesandzeros' in his GitHub comment.

Upvotes: 2

Related Questions