ah bon
ah bon

Reputation: 10011

Recombine sheets with multiple level of headers, and write out an excel file with different sheets named by one level of header

Suppose we have an excel file (you can download sample data from this link), which contains several sheets with three-layer headers. At present, the data is divided into different sheets according to the Indicator Name.

      Country Name             Mexico            Moldova
0   Indicator Name            PPP_GDP            PPP_GDP
1   Indicator Code  NY.GDP.PCAP.PP.CD  NY.GDP.PCAP.PP.CD
2             1988                NaN                NaN
3             1989               8.9                 NaN
4             1990               9.9              102.5 
5             1991               9.9              103.4 
6             1992               9.8              105.4 
7             1993               9.7              101.6 
8             1994               9.7              101.2 
9             1995               9.6              100.2 
10            1996               9.5               99.8 
11            1997               9.4               99.2 
12            1998               9.2               99.3 
13            1999               9.0               99.4 
14            2000                NaN                NaN

Now I hope to re-arrange them into different sheets according to the Country Name, how to do it?

My attempt is to loop each sheet, merge them horizontally into a large sheet, and then regroup it according to the Country Name.

sheets = [PPP_GDP, CPI, PPI]
final = []
for sheet in sheets:
    df = pd.read_excel('./test_data_2022-10-25.xlsx', sheet_name=sheet, header=[0, 1, 2])
    final.append(df)
    
excel_merged = pd.concat(final, ignore_index=True)
excel_merged.to_excel('./output.xlsx')

Out:

enter image description here

So the question is how could we split dataframe to four sheets as follows:

enter image description here

enter image description here

enter image description here

enter image description here

Note that I gave the expected result in the sample data: a new excel file containing Mexico, Moldova, Nepal and Israel as sheets.

Upvotes: 0

Views: 68

Answers (1)

jezrael
jezrael

Reputation: 862611

Create index by first column by index_col=0 in read_excel, join by pd.concat with axis=1 and last create sheetnames by country names form first level of MultiIndex:

sheets = ['PPP_GDP', 'CPI', 'PPI']
df = pd.concat([pd.read_excel('./test_data_2022-10-25.xlsx', 
                              sheet_name=sheet, 
                              header=[0, 1, 2], 
                              index_col=0) 
                 for sheet in sheets], axis=1)


writer = pd.ExcelWriter('output.xlsx', engine='xlsxwriter')

for name, g in df.groupby(level=0, axis=1):
    g.to_excel(writer, sheet_name=name)

writer.save()

Upvotes: 1

Related Questions