Reputation: 408
I have a DataFrame like below:
I want to create new column "merge" to concat the distinct text from column "note" when grouped by using column "class" and "subject":
I am new to Python, could someone please help me with a clean way to achieve this?
Thanks.
Upvotes: 0
Views: 730
Reputation: 3041
For example, if you start with:
0_x 0_y
A
a 3.0 2.0
b 2.0 0.0
c 0.0 2.0
Merging strings is easy
Starting with
df['merged'] = df['0_x'].astype(str)+'|'+df['0_y'].astype(str)
results in
0_x 0_y merged
A
a 3.0 2.0 3.0|2.0
b 2.0 0.0 2.0|0.0
c 0.0 2.0 0.0|2.0
If you want something more complex then using the .apply
method to execute a function across two columns works well.
def combine_cols(x):
if x['0_x']==0:
return str(x['0_x'])
else:
return str(x['0_x'])+'|'+str(x['0_y'])
df['special_merge'] = df.apply(combine_cols,axis=1)
results in
0_x 0_y merged special_merge
A
a 3.0 2.0 3.0|2.0 3.0|2.0
b 2.0 0.0 2.0|0.0 2.0|0.0
c 0.0 2.0 0.0|2.0 0.0
Upvotes: 1
Reputation: 1000
You can try using join() on the set of unique values on the note column after groupby.
df['merged'] = df.groupby(['class', 'subject']).note.transform(lambda x : "|".join(list(set(x))))
Upvotes: 1