The Great

Reputation: 7713

How to split a dataframe and store it in multiple sheets of an Excel file

I have a dataframe as shown below

import numpy as np
import pandas as pd
from numpy.random import default_rng
rng = default_rng(100)
cdf = pd.DataFrame({'Id':[1,2,3,4,5],
                   'customer': rng.choice(list('ACD'),size=(5)),
                   'region': rng.choice(list('PQRS'),size=(5)),
                   'dumeel': rng.choice(list('QWER'),size=(5)),
                   'dumma': rng.choice((1234),size=(5)),
                   'target': rng.choice([0,1],size=(5))
})

I would like to do the following:

a) extract the data for each unique combination of region and customer (that is, a groupby).

b) store each group in its own sheet of a single Excel file (one sheet per group).

I was trying something like the code below, but there should be some neat, Pythonic way to do this

df_list = []
grouped = cdf.groupby(['customer','region'])
for (cust, reg), v in grouped:
    # each comparison needs its own parentheses: & binds tighter than ==
    df = cdf[(cdf['customer']==cust) & (cdf['region']==reg)]
    df_list.append(df)

I expect my output to look like the screenshots below.

As my real data has 200 columns and a million rows, any efficient and elegant approach would really be helpful.

[Three screenshots of the expected output]

Upvotes: 1

Views: 1108

Answers (1)

jezrael

Reputation: 862661

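Write each group to its own sheet in a loop:

writer = pd.ExcelWriter('out.xlsx', engine='xlsxwriter')

# groupby yields ((customer, region), sub-dataframe) pairs
for (cust, reg), v in cdf.groupby(['customer','region']):
    v.to_excel(writer, sheet_name=f"DATA_{cust}_{reg}")

# Close the Pandas Excel writer and output the Excel file.
# (writer.save() was removed in pandas 2.0; close() also saves the file.)
writer.close()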
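With real data the group keys may not be as tidy as single letters. Excel limits sheet names to 31 characters, rejects the characters `[ ] : * ? / \`, and treats names case-insensitively, so a small guard around the sheet name can help. The following is only a sketch: `safe_sheet_name` and `write_groups` are hypothetical helper names, not part of pandas, and `index=False` is an assumption that the row index is not wanted in the output.

```python
import re

# Excel rejects these characters in sheet names and caps names at 31 characters.
_INVALID = re.compile(r"[\[\]:*?/\\]")
MAX_LEN = 31


def safe_sheet_name(raw, used):
    """Return a sheet name that is valid for Excel and not already in `used`."""
    name = _INVALID.sub("_", str(raw))[:MAX_LEN] or "Sheet"
    candidate, n = name, 1
    # Excel compares sheet names case-insensitively, so track lowercase forms.
    while candidate.lower() in used:
        suffix = f"_{n}"
        candidate = name[:MAX_LEN - len(suffix)] + suffix
        n += 1
    used.add(candidate.lower())
    return candidate


def write_groups(df, path, keys):
    """Write one sheet per unique combination of `keys` to the workbook at `path`."""
    import pandas as pd  # imported lazily so safe_sheet_name has no dependencies

    used = set()
    # The context manager saves and closes the file on exit.
    with pd.ExcelWriter(path) as writer:
        for group_key, group in df.groupby(keys):
            # Older pandas versions yield a scalar key for a single grouping column.
            if not isinstance(group_key, tuple):
                group_key = (group_key,)
            label = "_".join(str(k) for k in group_key)
            sheet = safe_sheet_name(f"DATA_{label}", used)
            group.to_excel(writer, sheet_name=sheet, index=False)
```

For the dataframe in the question this would be called as `write_groups(cdf, 'out.xlsx', ['customer', 'region'])`. Note that a single Excel sheet holds at most 1,048,576 rows, so with a million rows in total it is worth checking that no single group comes close to that limit.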

Upvotes: 3
