Reputation: 2155
I have a data frame:
A B C
0 xx No 1
1 xx No 2
2 xx Yes 3
3 xx Yes 4
4 xx No 1
5 xx No 2
6 xx No 3
7 xx Yes 4
8 xx No 1
9 xx No 2
If want to create a column that counts the number of times a number appear in column C, when column B is equal to Yes
so the output would be
A B C countifs
0 xx No 1
1 xx No 2
2 xx Yes 3 1
3 xx Yes 4 2
4 xx No 1
5 xx No 2
6 xx No 3
7 xx Yes 4 2
8 xx No 1
9 xx No 2
Is this possible?
Upvotes: 2
Views: 274
Reputation: 7625
Here is another approach, using numpy.where
, .value_counts()
, and .map()
:
>>> col_C_cnts = df[df.B == 'Yes']['C'].value_counts()
>>> df['countifs'] = np.where(df.B == 'Yes', df['C'].map(col_C_cnts), pd.np.nan)
>>> print(df.fillna(''))
A B C countifs
0 xx No 1
1 xx No 2
2 xx Yes 3 1
3 xx Yes 4 2
4 xx No 1
5 xx No 2
6 xx No 3
7 xx Yes 4 2
8 xx No 1
9 xx No 2
Upvotes: 1
Reputation: 862511
For count only True
s values use GroupBy.transform
with sum
only for filtered rows by mask:
m = df['B'].eq('Yes')
df['countifs'] = m.groupby(df.loc[m, 'C']).transform('sum').fillna('')
print (df)
A B C countifs
0 xx No 1
1 xx No 2
2 xx Yes 3 1
3 xx Yes 4 2
4 xx No 1
5 xx No 2
6 xx No 3
7 xx Yes 4 2
8 xx No 1
9 xx No 2
Upvotes: 4
Reputation: 88236
Here's one approach:
y = df.B.eq('Yes')
df.loc[y, 'countifs'] = df[y].groupby('C').B.transform('count').values
print(df.fillna(''))
A B C countifs
0 xx No 1
1 xx No 2
2 xx Yes 3 1
3 xx Yes 4 2
4 xx No 1
5 xx No 2
6 xx No 3
7 xx Yes 4 2
8 xx No 1
9 xx No 2
Upvotes: 3