How do I find median and mean of certain values in a column?

Question

I have a large CSV file, and I am trying to find the median and mean of one column for each group in another column. One of my columns is titled 'Race' and another is called 'debt_to_income_ratio'. The 'Race' column takes one of four values: 'White', 'Black', 'Hispanic', or 'Other'. The 'debt_to_income_ratio' column holds a number giving the debt-to-income ratio for that row. I am trying to get the median and mean debt-to-income ratio for each race (White, Black, Hispanic, and Other).

The code I am currently using is:

df['race average'] = df.groupby('Race')['debt_to_income_ratio'].transform('mean') %>%
df['race median'] = df.groupby('Race')['debt_to_income_ratio'].transform('median')

I'm not really sure what I should be doing, so thanks in advance for any help!

akrun · Accepted Answer

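We can use dplyr to do this. group_by splits the rows by Race, and mutate adds each group's mean and median as new columns on every row (use summarise instead of mutate to get one row per race):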

library(dplyr)
df %>%
    group_by(Race) %>%
    mutate(Mean = mean(debt_to_income_ratio, na.rm = TRUE),
           Median = median(debt_to_income_ratio, na.rm = TRUE))
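
The code in the question looks like pandas apart from the trailing %>%, so here is a rough pandas sketch of the same idea in case that is the environment in use. The sample data frame is made up for illustration and stands in for the real CSV; only the column names 'Race' and 'debt_to_income_ratio' come from the question, and the output column names are placeholders.

```python
import pandas as pd

# Made-up sample standing in for the real CSV (values are illustrative only)
df = pd.DataFrame({
    "Race": ["White", "White", "Black", "Black", "Hispanic", "Other"],
    "debt_to_income_ratio": [30.0, 40.0, 20.0, None, 35.0, 50.0],
})

# One row per race; mean and median skip missing values by default,
# which plays the role of na.rm = TRUE in the dplyr version
summary = (
    df.groupby("Race")["debt_to_income_ratio"]
      .agg(["mean", "median"])
      .reset_index()
)

# Per-row columns, analogous to mutate(): every row gets its group's statistic
df["race_mean"] = df.groupby("Race")["debt_to_income_ratio"].transform("mean")
df["race_median"] = df.groupby("Race")["debt_to_income_ratio"].transform("median")

print(summary)
```

The agg call is likely what is wanted when the goal is a small table of four rows, while transform keeps the original shape of the data frame.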
