Reputation: 309
With the following df:
Real_time SName VT on Act_win ratio
11:11:20 x 5 yes zoom 90
11:11:30 x 15 yes chrome 90
11:11:40 x 25 yes zoom 90
11:11:50 x 35 no zoom 90
11:12:00 x 45 yes zoom 90
11:12:10 x 55 yes zoom 90
11:12:20 x 65 yes zoom 50
11:12:30 x 75 no zoom 60
11:12:40 x 85 yes chrome 60
11:12:50 x 95 yes zoom 60
11:13:00 x 105 yes zoom 20
11:11:20 y 5 yes zoom 99
11:11:30 y 15 yes zoom 88
11:11:40 y 25 yes chrome 77
11:11:50 y 35 no chrome 49
11:12:00 y 45 yes zoom 56
11:12:10 y 55 yes zoom 46
11:12:20 y 65 yes zoom 90
11:12:30 y 75 no zoom 90
11:12:40 y 85 yes zoom 90
11:12:50 y 95 yes zoom 90
11:13:00 y 105 yes chrome 50
I need to group by timestamp and act_win and count instances in which df["act_win"] = "zoom"
per timestamp, where 0 occurrences should be counted as 0 (and not have the row dropped).
In addition, I would like to use "mean" for "ratio" on the same condition.
Lastly, I need to do the same thing with multiple conditions, given df["act_win"] = "zoom"
and df["on"] = "yes"
.
I got stuck because I just do a total count and then drop unnecessary rows. However if there where no "zoom" entries in a specific timestamp, it won't work. Unfortunately the answers I have found so far create a factorial column for months which doesn't help me in this situation.
EDIT -
expected result for final combined criteria (both df["on"] = "yes"
and df["act_win"] = "zoom"
):
Real_time count mean_r
11:11:20 2 94.5
11:11:30 1 88
11:11:40 1 90
11:11:50 0 0
11:12:00 2 73
11:12:10 2 68
11:12:20 2 70
11:12:30 0 0
11:12:40 1 90
11:12:50 2 75
11:13:00 1 20
Upvotes: 1
Views: 135
Reputation: 35115
There may be a better way to do this, but it is calculated by grouping the results against the conditional extraction results. Then I create a data frame from it with a unique timestamp and combine it with
df1 = df[(df['Act_win'] == 'zoom')&(df['on'] == 'yes')].groupby(['Real_time','Act_win']).agg(count=('ratio','count'), mean_r=('ratio','mean'))
df1.reset_index(inplace=True)
df2 = pd.DataFrame({'Real_time':df['Real_time'].unique().tolist()})
df2 = df2.merge(df1, on='Real_time', how='outer')
df2['Act_win'] = df2['Act_win'].fillna(method='ffill')
df2.fillna(0, inplace=True)
df2
Real_time Act_win count mean_r
0 11:11:20 zoom 2.0 94.5
1 11:11:30 zoom 1.0 88.0
2 11:11:40 zoom 1.0 90.0
3 11:11:50 zoom 0.0 0.0
4 11:12:00 zoom 2.0 73.0
5 11:12:10 zoom 2.0 68.0
6 11:12:20 zoom 2.0 70.0
7 11:12:30 zoom 0.0 0.0
8 11:12:40 zoom 1.0 90.0
9 11:12:50 zoom 2.0 75.0
10 11:13:00 zoom 1.0 20.0
Upvotes: 1