Reputation: 7490
I have a dataset like below:
i_id q_id week_month
0 1 NaN 27-7
1 2 NaN 27-7
2 3 NaN 27-7
3 4 1.0 27-7
4 5 2.0 27-7
I want to basically group by the variable Week_month and get the corresponding ratio of (q_id/i_id_) for that group. Like for above we would have something like
week_month rate
27-7 2/5
since q_id has only two valid ids and i_id has 5. So rate is defined as q_id/i_id.
I can do this in sql easily, but I don't want to use Pandsql for this. Any pandas way?
Upvotes: 1
Views: 253
Reputation: 29710
If I understand correctly, you could just use apply
with count
, as count
will get the total number of non-NA values in each column for each group.
df.groupby('week_month').apply(lambda x: x.q_id.count()/x.i_id.count())
Demo
>>> df.groupby('week_month').apply(lambda x: x.q_id.count()/x.i_id.count())
week_month
27-7 0.4
dtype: float64
This of course isn't making any assumptions about the actual contents of each group besides non-NA / NA for validity.
Upvotes: 1