Reputation: 1665
With the following code for a .xlsx file with a Multiindex in the columns:
import pandas as pd
df = pd.read_excel('Total Data.xlsx', header=[0,1], sheet_name="Control (E)")
c = df.xs('Time', drop_level=False, axis=1, level=1).columns
print(c.levels[1])
I get the following output:
c.levels[1]
Index([' SV (Nexn)', '%CO2', '%O2', 'CO (Nexn)', 'DBP (Calc)',
'EtCO2 (Nonin)', 'HR (Calc)', 'HR (Nexn)', 'MAP (Calc)',
'MAP (Nexn)', 'MV (calc)', 'MV (calc).1', 'RR (Nonin)', 'SBP (Calc)',
'Subject', 'TV (Calc)', 'Time', 'VO2 (calc)', 'Vi'],
dtype='object')
It appears to have something to do with the empty column I have separating the columns at level 0 (namely, Baseline and Stage 1), which subsequently also separates the level 1 columns.
This is a screenshot of the Excel file's multiindex header:
Why is MV (calc).1 showing up as a column when it's not actually in the Excel file, and how can I avoid it from occurring?
Upvotes: 0
Views: 750
Reputation: 39
You most likely have the same column name twice in your excel sheet.
All keys must be unique, which is why Pandas is appending a ".1" to the end of the second occurrence of a duplicate column name.
Upvotes: 3