Baktaawar
Baktaawar

Reputation: 7490

Grouping the ratio by a pandas column

I have a dataset like below:

    i_id    q_id    week_month
0   1       NaN      27-7
1   2       NaN      27-7
2   3       NaN      27-7
3   4       1.0      27-7
4   5       2.0      27-7

I want to basically group by the variable Week_month and get the corresponding ratio of (q_id/i_id_) for that group. Like for above we would have something like

week_month   rate
27-7         2/5

since q_id has only two valid ids and i_id has 5. So rate is defined as q_id/i_id.

I can do this in sql easily, but I don't want to use Pandsql for this. Any pandas way?

Upvotes: 1

Views: 253

Answers (1)

miradulo
miradulo

Reputation: 29710

If I understand correctly, you could just use apply with count, as count will get the total number of non-NA values in each column for each group.

df.groupby('week_month').apply(lambda x: x.q_id.count()/x.i_id.count())

Demo

>>> df.groupby('week_month').apply(lambda x: x.q_id.count()/x.i_id.count())

week_month
27-7    0.4
dtype: float64

This of course isn't making any assumptions about the actual contents of each group besides non-NA / NA for validity.

Upvotes: 1

Related Questions