coder_bg
coder_bg

Reputation: 370

How to split values in a column after using groupby in Pandas?

I have used the following code to display top 3 products of each retailer

baseline_df = baseline_df.groupby(['RtrName'])['Product'].agg(lambda x:x.value_counts().index[0:3]).to_frame().reset_index()

The result of this is a dataframe with two columns - RtrName and Product but each row in the Product column has 3 values split by a comma (,)

Result Output:

RtrName  Product
RtrName_a Product1,Product2,Product3
RtrName_b Product1,Product2,Product3

I want to have my dataframe in a way like : Expected Output

RtrName_a  Product1
           Product2
           Product3 
RtrName_b  Product1
           Product2
           Product3

How do I achieve this?

Upvotes: 0

Views: 165

Answers (2)

coder_bg
coder_bg

Reputation: 370

So I tried it this way and achieved what I wanted.

def func(x):
    x = x[:10]
    return x['Product']
    
temp = baseline_df.groupby('RtrName').apply(lambda x: func(x))
temp = temp.to_frame().reset_index().drop(columns=['level_1'])

Upvotes: 0

jezrael
jezrael

Reputation: 862751

Use DataFrame.explode of splitted values by Series.str.split before your solution:

baseline_df = (baseline_df.assign(Product = baseline_df.Product.str.split(','))
                          .explode('Product')
                          .groupby(['RtrName'])['Product']
                          .agg(lambda x:x.value_counts().index[:3])
                          .reset_index(name='new'))

Different output is if use explode after groupby:

baseline_df = (baseline_df.groupby(['RtrName'])['Product']
                          .agg(lambda x:x.value_counts().index[:3])
                          .str.split(',')
                          .explode()
                          .reset_index(name='Product')) 

Upvotes: 2

Related Questions