Reputation: 687
I have this excel file that I would like to "normalize" with the below code by dropping the A and B columns and then dropping the Rows 1,2,3,5 and making the row 4 as header.
df_weekly = pd.read_excel('Run_report.xlsm', sheet_name= "weekly")
df_weekly = df_weekly.drop(df_weekly.columns[[0,1]], axis=1)
df_weekly = df_weekly.drop(df_weekly.index[[0,1,2,4]])
print(df_weekly.head(8))
But I keep getting this weird df that I dont understand the reason why
Unnamed: 2 Unnamed: 3 ... Unnamed: 53 Unnamed: 54
3 NaN NaN ... NaN NaN
5 EU01 10 ... 5400537629432 NaN
6 EU01 10 ... 5400599118677 NaN
7 EU01 10 ... 5400599027382 NaN
8 EU01 10 ... 5400599028730 NaN
9 EU01 10 ... 5400599028761 NaN
10 EU01 10 ... 5400599613554 NaN
11 EU01 10 ... 5400599028945
Also:
top= df_weekly.columns.values.tolist()
print(top)
['Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4', 'Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8',....]
Upvotes: 0
Views: 218
Reputation: 148870
You must explicitely set the column names:
df_weekly = pd.read_excel('Run_report.xlsm', sheet_name= "weekly",
header=None)
df_weekly = df_weekly.drop(df_weekly.columns[[0,1]], axis=1)
df_weekly.columns = df_weekly.loc[3].rename(None)
df_weekly = df_weekly.drop(range(5))
print(df_weekly.head(8))
Upvotes: 1