Reputation: 700
I am using pandas to read Excel sheets that have a two-row header, using
df = pd.read_excel('./question.xlsx', sheet_name = None, header = [0,1])
which results in a DataFrame with MultiIndex columns for each sheet.
The problem is that the empty fields in the first header row are filled by default with 'Title', whereas I would prefer a distinct label. I cannot skip the first row, since I am dealing with bigger data frames where the first and second rows contain repeating labels (hence the MultiIndex).
Your help will be much appreciated.
Upvotes: 0
Views: 441
Reputation: 149095
Assuming that you want empty strings instead of the repeated first label, you can read the two header rows on their own and build the MultiIndex directly:
df1 = pd.read_excel('./question.xlsx', header = None, nrows=2).fillna('')
index = pd.MultiIndex.from_arrays(df1.values)
it gives:
MultiIndex([('Title',        '#'),
            (     '',    'Price'),
            (     '', 'Quantity')],
           )
By the way, if you want a different label for the empty fields, just pass it to fillna instead of the empty string.
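As a minimal sketch of that variation: the small in-memory frame below stands in for what read_excel with header=None and nrows=2 would return for a sheet like the one in the question (its contents and the label 'Other' are assumptions for illustration, not taken from the actual file).

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_excel(path, header=None, nrows=2):
# two header rows, where empty Excel cells come back as NaN.
header_rows = pd.DataFrame([
    ["Title", np.nan, np.nan],
    ["#", "Price", "Quantity"],
])

# 'Other' is an arbitrary placeholder label chosen for this example.
index = pd.MultiIndex.from_arrays(header_rows.fillna("Other").values)
print(index.tolist())
```

Each row of the filled frame becomes one level of the MultiIndex, so the placeholder only appears where the first header row was empty.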
Then you just read the remaining data and set the columns by hand:
df1 = pd.read_excel('./question.xlsx', header = None, skiprows=2)
df1.columns = index
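If you would rather read each sheet only once, a variation on the same idea is to read everything with header=None and split the header rows off afterwards. The helper below is a hypothetical sketch (apply_two_row_header is not a pandas function), and the sample frame is made-up data standing in for a sheet read with header=None.

```python
import numpy as np
import pandas as pd

def apply_two_row_header(raw, fill=""):
    """Hypothetical helper: use the first two rows of a header-less frame as MultiIndex columns."""
    header = raw.iloc[:2].fillna(fill)            # the two header rows, blanks replaced
    body = raw.iloc[2:].reset_index(drop=True)    # the remaining data rows
    body.columns = pd.MultiIndex.from_arrays(header.to_numpy())
    # Columns were object dtype because they also held header strings;
    # let pandas re-infer numeric dtypes for the data rows.
    return body.infer_objects()

# Stand-in for one sheet read with header=None (illustrative values).
raw = pd.DataFrame([
    ["Title", np.nan, np.nan],
    ["#", "Price", "Quantity"],
    [1, 9.5, 3],
    [2, 4.0, 7],
])
df = apply_two_row_header(raw)
print(df)
```

Since sheet_name=None makes read_excel return a dict of DataFrames keyed by sheet name, the same helper could be applied to every value of that dict when several sheets share this layout.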
Upvotes: 1