Reputation: 68
I have an Excel file that looks like the following:
that I'm reading in with pandas as follows:
data = pd.read_excel(excelFile,
sheetname=sheet,
skiprows=[0,1],
header=[0],
index_col=list(range(0,2)))
The problem is that I have blank index values (highlighted in yellow) that pandas ends up filling down with the index column names like so:
How can I prevent it from doing this?
Upvotes: 0
Views: 244
Reputation: 4004
You can achieve the desired outcome if you fill the empty cells in your file with a space character. Alternatively, you can specify fewer options when reading your file and use simple post-processing:
import pandas as pd
df = pd.read_excel('C:\problem.xlsx', skiprows = [0, 1])
# Replace NaNs with a space character
df = df.fillna(' ')
df = df.set_index(['Description', 'Segment'])
print(df)
Output:
$
Description Segment
Total 1.60528e+08
Brand A 3.67197e+07
Brand A Product 1 Awsome product 107493
Brand A Product 2 Awsome product
Upvotes: 1