SilvaC

Reputation: 141

Adding labels to outliers for a single boxplot using geom_text() in ggplot2

I have a single boxplot in R of percentage correct values (y-axis), with each point on the plot representing a different participant. I want to label my three outliers with the participant ID (Pt_ID). I created a data frame that includes a column $outlier to label these.

#Create function to identify outliers in terms of % correct
findoutlier <- function(x) {
  return(x < quantile(x, .25) - 1.5*IQR(x) | x > quantile(x, .75) + 1.5*IQR(x))
}

#Add a column to identify which participants are outliers
performance_tibble <- performance_tibble %>%
        mutate(outlier = ifelse(findoutlier(Perc_Correct), Pt_ID, NA))

#Plot boxplot of %correct including outliers labelled with Pt_ID
ggplot(performance_tibble) +
  geom_boxplot(aes(y=Perc_Correct), outlier.colour= "red") +
  theme(axis.text.x = element_blank(), axis.ticks.x= element_blank())

I have looked at other posts and tried adding + geom_text(aes(label=outlier)), but this fails with an error saying that geom_text() requires both x and y aesthetics (I only have a y variable, as it is a single boxplot). Can anyone suggest how to label these outliers without specifying an x aesthetic?

I have added an image of the boxplot with the outliers in red.

Upvotes: 0

Views: 440

Answers (1)

Miff

Reputation: 7941

You'll need to add a dummy value for the x aesthetic, and it's easier to move the y mapping into ggplot() so that all the layers use it. The only other change is to remove the x-axis title that then appears. That gives (with some dummy data):

library(dplyr)
library(ggplot2)

#Flag values more than 1.5*IQR beyond the quartiles
findoutlier <- function(x) {
  return(x < quantile(x, .25) - 1.5*IQR(x) | x > quantile(x, .75) + 1.5*IQR(x))
}

#Create some dummy data
set.seed(0)
performance_tibble <- tibble(Perc_Correct = -rlnorm(30), Pt_ID=sample(1:3, 30, TRUE))

#Add a column to identify which participants are outliers
performance_tibble <- performance_tibble %>%
  mutate(outlier = ifelse(findoutlier(Perc_Correct), Pt_ID, NA))

#Plot boxplot of %correct including outliers labelled with Pt_ID
#x=1 is a dummy value so that geom_text() has an x position
ggplot(performance_tibble, aes(y=Perc_Correct, x=1)) +
  geom_boxplot(outlier.colour= "red") +
  geom_text(aes(label=outlier), nudge_x=0.01) +
  theme(axis.text.x = element_blank(), 
        axis.ticks.x= element_blank(),
        axis.title.x = element_blank())

Output plot with point labels
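
One possible variation, offered as a sketch rather than as part of the original answer: because the outlier column is NA for most rows, geom_text() will typically warn about rows with missing values being removed. Assuming that warning is unwanted, the label layer can be given only the outlier rows by passing a function as its data argument. The logical outlier column and the hjust setting below are my own choices for illustration, and the data are the same dummy values as above.

```r
library(dplyr)
library(ggplot2)

# Same rule as above: flag values more than 1.5*IQR beyond the quartiles
findoutlier <- function(x) {
  x < quantile(x, .25) - 1.5 * IQR(x) | x > quantile(x, .75) + 1.5 * IQR(x)
}

# Dummy data (hypothetical, as in the answer above)
set.seed(0)
performance_tibble <- tibble(Perc_Correct = -rlnorm(30),
                             Pt_ID = sample(1:3, 30, TRUE)) %>%
  mutate(outlier = findoutlier(Perc_Correct))  # logical flag instead of NA-padded IDs

p <- ggplot(performance_tibble, aes(x = 1, y = Perc_Correct)) +
  geom_boxplot(outlier.colour = "red") +
  # A function passed as `data` receives the plot data; keep only flagged rows
  geom_text(data = function(d) filter(d, outlier),
            aes(label = Pt_ID), nudge_x = 0.01, hjust = 0) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.x = element_blank())

print(p)
```

The dummy x = 1 is still required here; the only difference is that the text layer never sees the non-outlier rows, so no labels are dropped for being NA.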

Upvotes: 1
