Reputation: 10011
Suppose we have an excel file (you can download sample data from this link), which contains several sheets with three-layer headers. At present, the data is divided into different sheets according to the Indicator Name
.
Country Name Mexico Moldova
0 Indicator Name PPP_GDP PPP_GDP
1 Indicator Code NY.GDP.PCAP.PP.CD NY.GDP.PCAP.PP.CD
2 1988 NaN NaN
3 1989 8.9 NaN
4 1990 9.9 102.5
5 1991 9.9 103.4
6 1992 9.8 105.4
7 1993 9.7 101.6
8 1994 9.7 101.2
9 1995 9.6 100.2
10 1996 9.5 99.8
11 1997 9.4 99.2
12 1998 9.2 99.3
13 1999 9.0 99.4
14 2000 NaN NaN
Now I hope to re-arrange them into different sheets according to the Country Name
, how to do it?
My attempt is to loop each sheet, merge them horizontally into a large sheet, and then regroup it according to the Country Name
.
sheets = [PPP_GDP, CPI, PPI]
final = []
for sheet in sheets:
df = pd.read_excel('./test_data_2022-10-25.xlsx', sheet_name=sheet, header=[0, 1, 2])
final.append(df)
excel_merged = pd.concat(final, ignore_index=True)
excel_merged.to_excel('./output.xlsx')
Out:
So the question is how could we split dataframe to four sheets as follows:
Note that I gave the expected result in the sample data: a new excel file containing Mexico
, Moldova
, Nepal
and Israel
as sheets.
Upvotes: 0
Views: 68
Reputation: 862611
Create index by first column by index_col=0
in read_excel
, join by pd.concat
with axis=1
and last create sheetnames by country names form first level of MultiIndex:
sheets = ['PPP_GDP', 'CPI', 'PPI']
df = pd.concat([pd.read_excel('./test_data_2022-10-25.xlsx',
sheet_name=sheet,
header=[0, 1, 2],
index_col=0)
for sheet in sheets], axis=1)
writer = pd.ExcelWriter('output.xlsx', engine='xlsxwriter')
for name, g in df.groupby(level=0, axis=1):
g.to_excel(writer, sheet_name=name)
writer.save()
Upvotes: 1