Split dataframe by multiple columns into multiple excels

Question

Consider the following dataframe df:

df = pd.DataFrame(dict(
        Id = [1, 2, 3, 3],
        Country=['jp', 'cn', 'uk', 'uk'],
        Sales = [5, 3, 3, 4]
    ))

print(df)
   Id Country  Sales
0   1      jp      5
1   2      cn      3
2   3      uk      3
3   3      uk      4

I want to iterate and groupby columns Id and Countryand write to excel:

for n, g in df.groupby(['Id', 'Country']):
    print(n)
    # `n` is the group name, which will be the country
    g.to_excel('{}.xlsx'.format(n), index = False)

It will create files: ["(1, 'jp').xlsx", "(2, 'cn').xlsx", "(3, 'uk').xlsx"], but I want the format like: ["1-jp.xlsx", "2-cn.xlsx", "3-uk.xlsx"].

How can I get it in Pandas? Thanks.

jezrael · Accepted Answer

You can unpack tuple to variables a and b and pass to format function:

for (a, b), g in df.groupby(['Id', 'Country']):
    print(a, b)
    # `n` is the group name, which will be the country
    g.to_excel('{}_{}.xlsx'.format(a, b), index = False)

Or select tuples by position by indexing:

for n, g in df.groupby(['Id', 'Country']):
    print(n)
    # `n` is the group name, which will be the country
    g.to_excel(f'{n[0]}_{n[1]}.xlsx'.format(n), index = False)

Split dataframe by multiple columns into multiple excels

Answers (2)

Related Questions