Reputation: 541
The following code generates a dummy dataframe:
import pandas as pd
df = pd.DataFrame(
{
'user_id': [1,2,3,1,1],
'account_type': ['google','facebook','apple','facebook','google'],
'activated': ['y','pending','n','y','y']
}
)
df.head()
What I need is to create a pivot table for unique_values in the account_type
column, which aggregates by the user_id
column.
Essentially, for each user_id
, I want to see how many of each types of accounts have a value other than n
under the activated
column.
The resulting dataframe should be:
I'm beat by this so far because I the pivot_table
function seems to only be able to work with numeric values.
Upvotes: 0
Views: 202
Reputation: 862611
Use DataFrame.assign
for new column filled by mask compared activated
for no n
, and then pivoting by DataFrame.pivot_table
with aggfunc=sum
for count True
s:
df = (df.assign(new= df['activated'].ne('n'))
.pivot_table(index='user_id',
columns='account_type',
values='new',
fill_value=0,
aggfunc='sum')
.rename_axis(columns=None))
print (df)
apple facebook google
user_id
1 0 1 2
2 0 1 0
3 0 0 0
Upvotes: 1
Reputation: 24314
Try Via groupby()
, where()
and transform()
method:
df['count']=(df.groupby(['user_id','account_type'])['activated']
.transform(lambda x:x.where(df['activated'].ne('n')).count()))
Finally use pivot_table()
and rename_axis()
method:
result=df.pivot_table(index='user_id',columns='account_type',values='count',fill_value=0).rename_axis(columns=None)
Output of result
:
apple facebook google
user_id
1 0 1 2
2 0 1 0
3 0 0 0
Upvotes: 2