dav3ypants
dav3ypants

Reputation: 68

How to retain blank index values in Multilevel index when reading from Excel with Python and pandas?

I have an Excel file that looks like the following: enter image description herethat I'm reading in with pandas as follows:

data = pd.read_excel(excelFile,
                     sheetname=sheet,
                     skiprows=[0,1],
                     header=[0],
                     index_col=list(range(0,2)))

The problem is that I have blank index values (highlighted in yellow) that pandas ends up filling down with the index column names like so: enter image description here
How can I prevent it from doing this?

Upvotes: 0

Views: 244

Answers (1)

KRKirov
KRKirov

Reputation: 4004

You can achieve the desired outcome if you fill the empty cells in your file with a space character. Alternatively, you can specify fewer options when reading your file and use simple post-processing:

    import pandas as pd

    df = pd.read_excel('C:\problem.xlsx', skiprows = [0, 1])

    # Replace NaNs with a space character
    df = df.fillna(' ')

   df = df.set_index(['Description', 'Segment'])

   print(df)

Output:

                                                  $
    Description         Segment                    
    Total                               1.60528e+08
     Brand A                            3.67197e+07
      Brand A Product 1 Awsome product       107493
      Brand A Product 2 Awsome product             

Upvotes: 1

Related Questions