Reputation: 101
Below is a small sample of a dataframe I have, and I want to add a calculated row to the bottom of it:
sch q1 q2 q3
acc Yes Yes No
acc Yes No No
acc Yes No No
acc Yes Yes Yes
I want to add a row at the bottom that will give me the percentage of values that are 'Yes' for each column, so that it would look like below.
sch q1 q2 q3
acc Yes Yes No
acc Yes No No
acc Yes No No
acc Yes Yes Yes
acc 1.00 0.5 0.25
Any help would be greatly appreciated.
Upvotes: 3
Views: 4614
Reputation: 1276
I see your lambda and raise a pure pandas solution:
df.append(df.eq('Yes').mean(), ignore_index=True)
You don't specify what should happen to the sch column, so I ignored it. In my current solution this column will get the value 0.
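On pandas 2.0 and later, `DataFrame.append` has been removed, so the same idea can be sketched with `pd.concat` instead. The snippet below is only a minimal sketch: it rebuilds the sample frame from the question, and filling `sch` with `'acc'` on the new row is an assumption taken from the desired output in the question, not something the one-liner above does.

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "sch": ["acc"] * 4,
    "q1": ["Yes", "Yes", "Yes", "Yes"],
    "q2": ["Yes", "No", "No", "Yes"],
    "q3": ["No", "No", "No", "Yes"],
})

# Fraction of 'Yes' per question column: the mean of a boolean mask.
pct = df.drop(columns="sch").eq("Yes").mean()

# Turn the Series into a one-row frame; sch='acc' is an assumption
# based on the desired output shown in the question.
summary = pct.to_frame().T.assign(sch="acc")

# Stack the summary under the data and restore the original column order.
result = pd.concat([df, summary], ignore_index=True)[df.columns]
print(result)
```

Dropping `sch` before taking the mean keeps the calculation limited to the question columns, and `ignore_index=True` gives the new row the next integer label.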
Upvotes: 3
Reputation: 210852
Consider the following approach:
In [11]: df.loc[len(df)] = ['acc'] + df.filter(regex=r'^q\d+') \
    .eq('Yes').mean().values.tolist()
In [12]: df
Out[12]:
sch q1 q2 q3
0 acc Yes Yes No
1 acc Yes No No
2 acc Yes No No
3 acc Yes Yes Yes
4 acc 1 0.5 0.25
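For anyone running this outside IPython, the same approach can be written as a plain script. This is a hedged sketch rather than a definitive version: the sample frame is rebuilt from the question, and using `len(df)` as the new label assumes the frame still has its default 0..n-1 integer index.

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "sch": ["acc"] * 4,
    "q1": ["Yes", "Yes", "Yes", "Yes"],
    "q2": ["Yes", "No", "No", "Yes"],
    "q3": ["No", "No", "No", "Yes"],
})

# Pick the question columns by name pattern; the raw string keeps \d
# from being treated as a string escape.
pct = df.filter(regex=r"^q\d+").eq("Yes").mean()

# Assigning to a label that does not exist yet enlarges the frame in place.
# len(df) is the next free label only for a default integer index.
df.loc[len(df)] = ["acc"] + pct.tolist()
print(df)
```

Unlike the concatenation-based answers, this modifies `df` in place instead of returning a new frame.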
Upvotes: 2
Reputation: 153460
Let's use pd.concat, mean, to_frame, and T for transpose.
pd.concat([df, df.replace({'Yes': True, 'No': False}).mean().to_frame().T.assign(sch='acc')])
Output:
q1 q2 q3 sch
0 Yes Yes No acc
1 Yes No No acc
2 Yes No No acc
3 Yes Yes Yes acc
0 1 0.5 0.25 acc
Upvotes: 1
Reputation: 6114
df.append(df.apply(lambda x: len(x[x == 'Yes']) / len(x)), ignore_index=True)
Output:
q1 q2 q3
0 Yes Yes No
1 Yes No No
2 Yes No No
3 Yes Yes Yes
4 1 0.5 0.25
Upvotes: 1