Dealing with the Multiindex headers dataFrame - Python

Question

I have a dataframe which looks like this. In the header, it have 2 lines of header, like one heading in row 1 cover 5 subheaddings in row 2.

	     Toneladas planta, ton		
  Fecha 	Fecha 	Ton  SAG1	Ton1
1_2017	1/1/2017	827	1309	2195
1_2017	1/2/2017	913	1343	2222
1_2017	1/3/2017	887	1435	2272
1_2017	1/4/2017	877	1388	2151
1_2017	1/5/2017	900	1236	2177
1_2017	1/6/2017	797	1201	2012
1_2017	1/7/2017	751	1215	2109
1_2017	1/8/2017	851	1241	2109
1_2017	1/9/2017	917	1408	2303
1_2017	1/10/2017	864	1529	2414
1_2017	1/11/2017	911	1560	2383

The values in row 2 are of value to me. But when i am applying functions on the data frame, it is unable to identify the row 2 headers and gives false values. For example: df.info() like functions, gives the wrong values.

I would like to know, if there is a way, i can either mix the two headers to one, in a way that that row 1 becomes the prefix and row 2 as suffix of the shares heading.

Like : Toneladas planta, ton Fecha Fecha

becomes: Toneladas planta, ton Fecha Toneladas planta, ton Fecha

as otherwise its too difficult to work with the dataframe.

Thanks

Trenton McKinney · Accepted Answer

Given the following Excel Sheet:

Just specify the header row with the header parameter, and usecols to get the correct columns

df = pd.read_excel('file.xlsx', header=3, usecols='A:E')

  Fecha    Fecha.1    Ton SAG1     Ton SAG2  Ton  Planta
 1_2017 2017-01-01  826.555503  1308.834944  2194.939490
 1_2017 2017-01-02  912.653670  1343.165048  2221.776328
 1_2017 2017-01-03  886.866000  1434.944123  2272.475950
 1_2017 2017-01-04  877.476604  1388.086279  2150.790596
 1_2017 2017-01-05  900.459985  1236.101284  2177.152583

Dealing with the Multiindex headers dataFrame - Python

Answers (1)

Given the following Excel Sheet:

Related Questions