Reputation: 748
I have pandas dataframe of the form,df=
index,result1,result2,result3
0 s u s
1 u s u
2 s
3 s s u
i would like to add another column that contains a list of the number of times s occurs in that row, for example
index,result1,result2,result3,count
0 s u s 2
1 u s u 1
2 s 1
3 s s u 2
i have tried the following code
col=['result1','result2','result3']
df[cols].count(axis=1)
but this returns
0,3
1,3
2,1
3,3
so this counts the number of elements, i then tried
df[df[cols]=='s'].count(axis=1)
but this returned the following error: "Could not compare ['s'] with block values"
Any help would be greatly appreciated
Upvotes: 1
Views: 1705
Reputation: 863236
For me works cast to string
by astype
numeric and NaN
columns return your error
:
print (df)
index result1 result2 result3 result4
0 0 s u 7 NaN
1 1 u s 7 NaN
2 2 s NaN 8 NaN
3 3 s s 7 NaN
4 4 NaN NaN 2 NaN
print (df.dtypes)
index int64
result1 object
result2 object
result3 int64
result4 float64
dtype: object
cols = ['result1','result2','result3','result4']
df['count'] = df[df[cols].astype(str) == 's'].count(axis=1)
print (df)
index result1 result2 result3 result4 count
0 0 s u 7 NaN 1
1 1 u s 7 NaN 1
2 2 s NaN 8 NaN 1
3 3 s s 7 NaN 2
4 4 NaN NaN 2 NaN 0
Or sum
only True
values from boolean mask
:
print (df[cols].astype(str) == 's')
result1 result2 result3 result4
0 True False False False
1 False True False False
2 True False False False
3 True True False False
4 False False False False
cols = ['result1','result2','result3','result4']
df['count'] = (df[cols].astype(str) =='s').sum(axis=1)
print (df)
index result1 result2 result3 result4 count
0 0 s u 7 NaN 1
1 1 u s 7 NaN 1
2 2 s NaN 8 NaN 1
3 3 s s 7 NaN 2
4 4 NaN NaN 2 NaN 0
Another nice solution is from Nickil Maveli - use numpy
:
df['count'] = (df[cols].values=='s').sum(axis=1)
Upvotes: 2