Amiclone
Amiclone

Reputation: 408

Merge text of all rows in a column in pandas group by

I have a DataFrame like below:

enter image description here

I want to create new column "merge" to concat the distinct text from column "note" when grouped by using column "class" and "subject":

enter image description here

I am new to Python, could someone please help me with a clean way to achieve this?

Thanks.

Upvotes: 0

Views: 730

Answers (2)

willwrighteng
willwrighteng

Reputation: 3041

For example, if you start with:

    0_x 0_y
A       
a   3.0 2.0
b   2.0 0.0
c   0.0 2.0

Merging strings is easy

Starting with

df['merged'] = df['0_x'].astype(str)+'|'+df['0_y'].astype(str)

results in

    0_x 0_y merged
A           
a   3.0 2.0 3.0|2.0
b   2.0 0.0 2.0|0.0
c   0.0 2.0 0.0|2.0

If you want something more complex then using the .apply method to execute a function across two columns works well.

def combine_cols(x):
    if x['0_x']==0:
        return str(x['0_x'])
    else:
        return str(x['0_x'])+'|'+str(x['0_y'])
        
df['special_merge'] = df.apply(combine_cols,axis=1)

results in


    0_x 0_y merged  special_merge
A               
a   3.0 2.0 3.0|2.0 3.0|2.0
b   2.0 0.0 2.0|0.0 2.0|0.0
c   0.0 2.0 0.0|2.0 0.0

Upvotes: 1

Rajesh Bhat
Rajesh Bhat

Reputation: 1000

You can try using join() on the set of unique values on the note column after groupby.

df['merged'] = df.groupby(['class', 'subject']).note.transform(lambda x : "|".join(list(set(x))))

Upvotes: 1

Related Questions