Reputation: 420
library(ggplot2)
A <- rep(LETTERS[1:5], 2)
B <- rep(c("one", "two"), 5)
set.seed(200)
C <- round(rnorm(10), 2)
dff <- data.frame(A, B, C)
dff
ggplot(dff, aes(x = B, y = C, fill = B)) +
  geom_boxplot()
Is it possible to use A to label the outliers?
Upvotes: 0
Views: 256
Reputation: 525
Here's a solution to label only the outliers in your data:
library(tidyverse)
# Flag outliers within each group of B, so a value is only compared
# against the outliers of its own box.
labelled <- dff %>%
  group_by(B) %>%
  mutate(label = if_else(C %in% boxplot.stats(C)$out, as.character(A), "")) %>%
  ungroup()
ggplot(labelled, aes(x = B, y = C, fill = B)) +
  geom_boxplot() +
  geom_text(aes(label = label), position = position_nudge(x = -.1))
which produces this plot:
Upvotes: 1
Reputation: 420
I adapted the second answer from the question linked in the first comment to suit my case.
library(dplyr)
# TRUE for values beyond 1.5 * IQR from the first or third quartile
is_outlier <- function(x) {
  x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x)
}
# Per group of B, keep the A label for outliers and NA for everything else
dat <- dff %>%
  group_by(B) %>%
  mutate(label = ifelse(is_outlier(C), as.character(A), NA_character_)) %>%
  ungroup()
ggplot(dat, aes(x = B, y = C, fill = B)) +
  geom_boxplot() +
  # na.rm = TRUE silently drops the rows whose label is NA
  geom_text(aes(label = label), na.rm = TRUE, nudge_y = 0.05)
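To sanity-check the rule before plotting, here is a minimal sketch that runs the same is_outlier() helper on a made-up vector. The name toy and its values are assumptions for illustration only and are not taken from dff.

```r
# Same 1.5 * IQR rule as is_outlier() above, repeated so the snippet runs on its own.
is_outlier <- function(x) {
  x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x)
}

# Made-up values for illustration only (hypothetical, not from dff).
toy <- c(1, 2, 3, 4, 100)
flags <- is_outlier(toy)

# Quartiles are 2 and 4, so IQR = 2 and the fences are -1 and 7;
# only 100 falls outside them.
toy[flags]
```

Running this on one group at a time (for example dff$C[dff$B == "one"]) should show which points will end up with a label.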
Might not be the best answer :D
Upvotes: 1