Prakhar Mehrotra

Reputation: 1231

Mean excluding outliers using dplyr

Is there a way to compute the mean excluding outliers using the dplyr package in R? I tried something like this, but it did not work:

library(dplyr)
w = rep("months", 4)
value = c(1, 10, 12, 9)
df = data.frame(w, value)
output = df %>% group_by(w) %>% summarise(m = mean(value, na.rm = T, outlier = T))

So in the above example, the output should be 10.333 (the mean of 10, 12, and 9) instead of 8 (the mean of 1, 10, 12, and 9).

Thanks!

Upvotes: 2

Views: 5503

Answers (1)

jazzurro

Reputation: 23574

One way would be something like this, using the outliers package.

library(outliers) # provides outlier(), which returns the value farthest from the sample mean
library(dplyr)

df %>%
    group_by(w) %>%
    filter(!value %in% outlier(value)) %>%
    summarise(m = mean(value, na.rm = TRUE))

#       w        m
#1 months 10.33333
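
Note that outlier() reports the value with the largest difference from the sample mean, so the filter above drops one such value from every group, even when nothing in the group is really extreme. If that is not what you want, a rule-based filter may suit better. The following is only a sketch, assuming the common 1.5 * IQR rule as the outlier definition; is_iqr_outlier() and its k argument are hypothetical names made up for this example, not part of dplyr or outliers.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(w = rep("months", 4), value = c(1, 10, 12, 9))

# Hypothetical helper (an assumption, not a library function):
# TRUE for values lying more than k * IQR below Q1 or above Q3
is_iqr_outlier <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- k * (q[[2]] - q[[1]])
  x < q[[1]] - fence | x > q[[2]] + fence
}

output <- df %>%
  group_by(w) %>%
  # Average only the values that fall inside the fences
  summarise(m = mean(value[!is_iqr_outlier(value)], na.rm = TRUE))

output
```

With the example data, only the value 1 falls outside the fences, so m is again the mean of 10, 12, and 9. Unlike the outlier() approach, a group with no extreme values keeps all of its observations.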

Upvotes: 9
