Reputation: 418
I have the following dataframe:
Name Area
0 Emmeline G
1 Erek L
2 Perrine H
3 Donelle K
4 Nichols E
5 Corinne B
6 Emilia A
7 Dierdre G
8 Hadrian K
9 Tyson B
10 Emmeline D
11 Wynne L
12 Luigi H
13 Martelle J
14 Nichols G
15 Nichols D
16 Tyson G
17 Perrine D
18 Tyson C
19 Martelle C
I want to merge rows that have the same name, concatenating their Area values. The final dataframe should look like this:
Name Area
0 Emmeline GD
1 Erek L
2 Perrine HD
3 Donelle K
4 Nichols EGD
5 Corinne B
6 Emilia A
7 Dierdre G
8 Hadrian K
9 Tyson BGC
10 Wynne L
11 Luigi H
12 Martelle JC
I believe I could do this by combining groupby with join, but I am unsure exactly how to do it. Any suggestions?
Upvotes: 1
Views: 152
Reputation: 150745
Another option is groupby().apply:
# `as_index` and `sort` options are to match the order in expected output
df.groupby('Name', as_index=False, sort=False)['Area'].apply(''.join)
Output:
Name Area
0 Emmeline GD
1 Erek L
2 Perrine HD
3 Donelle K
4 Nichols EGD
5 Corinne B
6 Emilia A
7 Dierdre G
8 Hadrian K
9 Tyson BGC
10 Wynne L
11 Luigi H
12 Martelle JC
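For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the question's dataframe and performs the same concatenation. It uses agg followed by reset_index rather than apply with as_index=False; that substitution is my assumption, made because it should behave the same way across pandas versions. The variable names (names, areas, result) are illustrative only.

```python
import pandas as pd

# Rebuild the dataframe from the question
names = ["Emmeline", "Erek", "Perrine", "Donelle", "Nichols", "Corinne",
         "Emilia", "Dierdre", "Hadrian", "Tyson", "Emmeline", "Wynne",
         "Luigi", "Martelle", "Nichols", "Nichols", "Tyson", "Perrine",
         "Tyson", "Martelle"]
areas = list("GLHKEBAGKBDLHJGDGDCC")
df = pd.DataFrame({"Name": names, "Area": areas})

# sort=False keeps groups in order of first appearance;
# ''.join concatenates the Area strings within each group
result = (
    df.groupby("Name", sort=False)["Area"]
      .agg("".join)
      .reset_index()
)
print(result)
```

The reset_index call turns the Name index back into a regular column, so the result has the same two-column layout as the expected output.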
Upvotes: 0
Reputation: 1614
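Try groupby and sum. Summing strings concatenates them. Note that groupby sorts the names by default, so pass sort=False if you need to keep the original order.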
df.groupby(by="Name").sum().reset_index()
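A small sketch of why sum works here and how the default sorting affects row order. The four-row frame below is a hypothetical subset of the question's data, chosen only to keep the example short; selecting the Area column explicitly is my addition so the sum applies to that column alone.

```python
import pandas as pd

# Hypothetical subset of the question's data
df = pd.DataFrame({
    "Name": ["Tyson", "Corinne", "Tyson", "Tyson"],
    "Area": ["B", "B", "G", "C"],
})

# Summing strings is repeated "+", i.e. concatenation.
# By default groupby sorts the group keys alphabetically.
sorted_result = df.groupby("Name")["Area"].sum().reset_index()

# sort=False keeps groups in order of first appearance instead
ordered_result = df.groupby("Name", sort=False)["Area"].sum().reset_index()

print(sorted_result)
print(ordered_result)
```

Both results contain the same concatenated values; only the row order differs.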
Upvotes: 4
Reputation: 26676
Use groupby() with .str.cat in an agg function:
df.groupby('Name')['Area'].agg(lambda x: x.str.cat()).to_frame('Area')
Area
Name
Corinne B
Dierdre G
Donelle K
Emilia A
Emmeline GD
Erek L
Hadrian K
Luigi H
Martelle JC
Nichols EGD
Perrine HD
Tyson BGC
Wynne L
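The output above is sorted by name and keeps Name as the index. If the layout from the question is wanted instead, a sketch along these lines should work, assuming sort=False and reset_index are acceptable additions. It also shows the optional sep argument of str.cat, which is one reason to choose it over a plain join. The small frame is a hypothetical subset of the question's data.

```python
import pandas as pd

# Hypothetical subset of the question's data
df = pd.DataFrame({
    "Name": ["Nichols", "Emilia", "Nichols", "Nichols"],
    "Area": ["E", "A", "G", "D"],
})

grouped = df.groupby("Name", sort=False)["Area"]

# Series.str.cat() with no arguments joins all values of the group
flat = grouped.agg(lambda s: s.str.cat()).reset_index()

# sep inserts a separator between the joined values
separated = grouped.agg(lambda s: s.str.cat(sep=",")).reset_index()

print(flat)
print(separated)
```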
Upvotes: 1