Reputation: 157
I have the following data frame df1
:
id | action |
----|--------|
1 | A |
1 | A |
2 | C |
3 | D |
2 | B |
2 | C |
1 | B |
3 | D |
1 | D |
3 | A |
I want to count the frequency that the value action
appear for each user, and put this value in new column. I want to get the following data frame :
id | action=A | action=B | action=C |
----|----------|----------|----------|
1 | 2 | 1 | 0 |
2 | 0 | 1 | 2 |
3 | 1 | 0 | 0 |
Upvotes: 0
Views: 172
Reputation: 6589
First you should add an other column Count
with the value 1
df['Count'] = 1
df
id action Count
0 1 A 1
1 1 A 1
2 2 C 1
3 3 D 1
4 2 B 1
5 2 C 1
6 1 B 1
7 3 D 1
8 1 D 1
9 3 A 1
Then you can use pivot_table()
to get the desired result.
df.pivot_table('Count', index='id', columns='action', aggfunc='sum', fill_value=0)
action A B C D
id
1 2 1 0 1
2 0 1 2 0
3 1 0 0 2
Upvotes: 2
Reputation: 210832
I've found one solution, but i don't like it:
import six
import numpy as np
import pandas as pd
data = """\
id action
1 A
1 A
2 C
3 D
2 B
2 C
1 B
3 D
1 D
3 A
"""
df = pd.read_csv(six.StringIO(data), sep='\s+')
print(df)
pvt = df.pivot(columns='action', values='id')
pvt['id'] = df['id']
print(pvt.groupby('id').count().reset_index().to_string(index=False))
Output:
id action
0 1 A
1 1 A
2 2 C
3 3 D
4 2 B
5 2 C
6 1 B
7 3 D
8 1 D
9 3 A
id A B C D
1 2 1 0 1
2 0 1 2 0
3 1 0 0 2
PS i hope there will be another, more elegant ones
Upvotes: 0