Reputation: 109
I have a dataframe as shown below:
col1 = ['a','b','c','a','c','a','b','c','a']
col2 = [1,1,0,1,1,0,1,1,0]
df2 = pd.DataFrame(zip(col1,col2),columns=['name','count'])
name count
0 a 1
1 b 1
2 c 0
3 a 1
4 c 1
5 a 0
6 b 1
7 c 1
8 a 0
I am trying to count the number of 0s and 1s corresponding to each element in the 'name' column. So the expected output would look like:
name zero_count one_count
a 2 2
b 0 2
c 1 2
So far I tried many scenarios and one that looked promising was:
ser = df2.groupby(['name','count']).size().to_frame().reset_index()
ser
name count 0
0 a 0 2
1 a 1 2
2 b 1 2
3 c 0 1
4 c 1 2
What further things can I try to fix this?
Upvotes: 1
Views: 66
Reputation: 76
#count zeros:
df2.groupby(['name']).agg(lambda x: x.eq(0).sum())
#count ones:
df2.groupby(['name']).agg(lambda x: x.eq(1).sum())
Upvotes: 1
Reputation: 34056
One-liner:
In [982]: df2.groupby(['name','count']).size().reset_index().pivot('name', 'count')
Out[982]:
0
count 0 1
name
a 2.0 2.0
b NaN 2.0
c 1.0 2.0
Explanation step-wise:
In [950]: res = df2.groupby(['name','count']).size().reset_index(name='counts')
In [958]: out = res.pivot(index='name', columns='count', values='counts').fillna(0)
In [959]: out.columns = ['zero_count', 'one_count']
In [960]: out
Out[960]:
zero_count one_count
name
a 2.0 2.0
b 0.0 2.0
c 1.0 2.0
Upvotes: 1
Reputation: 323276
Try crosstab
pd.crosstab(df2['name'], df2['count'])
Out[40]:
count 0 1
name
a 2 2
b 0 2
c 1 2
Upvotes: 4