How to calculate mean and median based on the label of a column in Python

Question

I have a large data frame that looks like this:

price   type      status
2       shoes      none
3       clothes    none
6       clothes    none
3       shoes      none
4       shoes      none
6       shoes      none
2       clothes    none
3       shoes      none
6       clothes    none
8       clothes    done

Basically, I want to calculate the mean and median of "price" per "type" whenever the "status" is "done". So far, I have first built groups from the rows where the status is "done", and then calculated the mean and median of each group with the script below:

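# Label each block of rows that ends at a "done" row (reverse cumulative sum)
g = df['status'].eq('done').iloc[::-1].cumsum().iloc[::-1]
grouper = df.groupby(g)
# Mean and median of price per block
df_statistics = grouper.agg(
    mean=('price', 'mean'),
    median=('price', 'median'),
)
# Most frequent price per block
df_freq = grouper.apply(lambda x: x['price'].value_counts().idxmax())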

How can I add "type" as one more grouping parameter, so that the script also calculates the mean and median of each group per "type"?

Thank you

jezrael · Accepted Answer

I think you need to put the column name in a list together with g and pass that list to groupby:

grouper = df.groupby([g, 'type'])
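For illustration, here is a minimal end-to-end sketch built on the sample data from the question. The name "block" given to g is a hypothetical label added only to make the result index easier to read; it is not required. Note that in this sample only the last row is "done", so g is 1 for every row and the split comes entirely from "type".

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "price": [2, 3, 6, 3, 4, 6, 2, 3, 6, 8],
    "type": ["shoes", "clothes", "clothes", "shoes", "shoes",
             "shoes", "clothes", "shoes", "clothes", "clothes"],
    "status": ["none"] * 9 + ["done"],
})

# Reverse cumulative sum: every row gets the label of the block
# that ends at the next "done" row (inclusive)
g = df["status"].eq("done").iloc[::-1].cumsum().iloc[::-1]
g = g.rename("block")  # hypothetical name, only for a clearer index

# Group by the block label and by the "type" column together
grouper = df.groupby([g, "type"])

df_statistics = grouper.agg(
    mean=("price", "mean"),
    median=("price", "median"),
)

# Most frequent price per (block, type) pair
df_freq = grouper["price"].agg(lambda s: s.value_counts().idxmax())

print(df_statistics)
print(df_freq)
```

With this data, the result has one row per (block, type) pair: clothes has mean 5.0 and median 6.0, shoes has mean 3.6 and median 3.0.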
