Reputation: 107
I am trying to generate a unique group-value for each observation made up of the contents of a column concatenated together, while keeping all the rows intact.
I have observations that can be grouped on a specific column (column A below). I want to create a single value per group, made of the content of each row in that group, while keeping the rows untouched.
I have tried solutions provided here and here, but these solutions collapse the results, leaving one row per group, whereas I wish to keep all rows.
import pandas as pd
d = {'A': [1, 2, 3, 3, 4, 5, 5, 5, 6],
     'B': [345, 366, 299, 455, 879, 321, 957, 689, 543]}
df = pd.DataFrame(d)
print(df)
A B
0 1 345
1 2 366
2 3 299
3 3 455
4 4 879
5 5 321
6 5 957
7 5 689
8 6 543
df['B'] = df['B'].astype(str)
df['B_concat'] = df.groupby(['A'])['B'].apply('/'.join)
print(df)
A B B_concat
0 1 345 NaN
1 2 366 345
2 3 299 366
3 3 455 299/455
4 4 879 879
5 5 321 321/957/689
6 5 957 543
7 5 689 NaN
8 6 543 NaN
Rows in the same group should have the same B_concat value, like this:
A B B_concat
0 1 345 345
1 2 366 366
2 3 299 299/455
3 3 455 299/455
4 4 879 879
5 5 321 321/957/689
6 5 957 321/957/689
7 5 689 321/957/689
8 6 543 543
Upvotes: 1
Views: 43
Reputation: 863431
Use GroupBy.transform, which returns a Series the same size as the original DataFrame, so the result can be assigned to a new column:
df['B'] = df['B'].astype(str)
df['B_concat'] = df.groupby(['A'])['B'].transform('/'.join)
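To see why transform helps here, it may be useful to compare the index of an aggregated result with the index of a transformed one. The following is only a minimal sketch; it assumes the nine-row data shown in the question's printed DataFrame, and the names per_group and per_row are made up for illustration:

```python
import pandas as pd

# Data as printed in the question (nine rows); assumed for illustration.
df = pd.DataFrame({'A': [1, 2, 3, 3, 4, 5, 5, 5, 6],
                   'B': [345, 366, 299, 455, 879, 321, 957, 689, 543]})
df['B'] = df['B'].astype(str)

# Aggregating yields one value per group, indexed by the group key (values of A).
per_group = df.groupby('A')['B'].agg('/'.join)

# transform broadcasts each group's value back to every row of that group,
# so the result keeps the original row index.
per_row = df.groupby('A')['B'].transform('/'.join)

print(per_group.index.tolist())
print(per_row.index.tolist())
```

Because assignment to a column aligns on the index, a result indexed by the values of A ends up on the wrong rows, while a result that keeps the original row index lines up as intended.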
A one-line solution:
df['B_concat'] = df['B'].astype(str).groupby(df['A']).transform('/'.join)
print(df)
A B B_concat
0 1 345 345
1 2 366 366
2 3 299 299/455
3 3 455 299/455
4 4 879 879
5 5 321 321/957/689
6 5 957 321/957/689
7 5 689 321/957/689
8 6 543 543
Or, converting to strings inside the function instead of beforehand:
df['B_concat'] = df.groupby(['A'])['B'].transform(lambda x: '/'.join(x.astype(str)))
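If a more explicit two-step version is easier to follow, one possible alternative (not part of the solutions above) is to aggregate once per group and then map the result back onto column A. This is a sketch under the same assumption about the data as printed in the question; the name joined is hypothetical:

```python
import pandas as pd

# Data as printed in the question (nine rows); assumed for illustration.
df = pd.DataFrame({'A': [1, 2, 3, 3, 4, 5, 5, 5, 6],
                   'B': [345, 366, 299, 455, 879, 321, 957, 689, 543]})

# Step 1: one joined string per group, indexed by the values of A.
joined = df['B'].astype(str).groupby(df['A']).agg('/'.join)

# Step 2: look up each row's group key in that Series.
df['B_concat'] = df['A'].map(joined)

print(df)
```

This leaves column B as integers and should give the same B_concat column as the transform approach; transform is usually the more direct choice.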
Upvotes: 1