Reputation: 193
I have a dataframe that look like this :
name event_code_0 event_code_1100 event_code_1110 event_code_1120 event_code_1120 event_code_1199
0 A 1 0 0 0 0 0
1 A 0 0 1 0 0 0
2 A 0 0 0 1 0 0
3 A 0 0 1 0 0 0
4 A 0 1 0 0 0 0
5 A 0 0 0 0 1 0
6 A 0 0 0 1 0 0
7 A 0 1 0 0 0 0
8 A 1 0 0 0 0 0
9 B 0 0 0 0 0 1
10 B 0 0 0 0 1 0
11 B 0 0 1 0 0 0
12 B 0 1 0 0 0 0
13 B 1 0 0 0 0 0
14 B 0 0 1 0 0 0
15 B 0 0 0 1 0 0
16 B 0 1 0 0 0 0
17 B 0 0 0 0 0 1
I would like to merge the differents rows on the column : 'name' and count the event code.
For instance the dataframe should look like :
name event_code_0 event_code_1100 event_code_1110 event_code_1120 event_code_1120 event_code_1199
0 A 2 2 2 2 1 0
1 B 1 2 2 1 1 2
I have try different things with .merge but without success. Maybe someone know an easy trick...
Cheers, Astrus
Upvotes: 1
Views: 1195
Reputation: 862641
You need groupby
and aggregate sum
, for name
to column add reset_index
or parameter as_index=False
:
df1 = df.groupby('name').sum().reset_index()
print (df1)
name event_code_0 event_code_1100 event_code_1110 event_code_1120 \
0 A 2 2 2 2
1 B 1 2 2 1
event_code_1120 event_code_1199
0 1 0
1 1 2
Or:
df1 = df.groupby('name', as_index=False).sum()
print (df1)
name event_code_0 event_code_1100 event_code_1110 event_code_1120 \
0 A 2 2 2 2
1 B 1 2 2 1
event_code_1120 event_code_1199
0 1 0
1 1 2
Upvotes: 4
Reputation:
You need to group by the name column and take the sum:
df.groupby('name').sum()
Out:
event_code_0 event_code_1100 event_code_1110 event_code_1120 \
name
A 2 2 2 2
B 1 2 2 1
event_code_1120.1 event_code_1199
name
A 1 0
B 1 2
This, by default, sets the name column as index. You can change that by:
df.groupby('name', as_index=False).sum()
Out:
name event_code_0 event_code_1100 event_code_1110 event_code_1120 \
0 A 2 2 2 2
1 B 1 2 2 1
event_code_1120.1 event_code_1199
0 1 0
1 1 2
or df.groupby('name').sum().reset_index()
.
Upvotes: 2