
Reputation: 1616

How can I plot only one text label per group in a jitter plot?

I want to plot the mean as text only once for each cluster.

This is the plot I currently have: enter image description here

but what I want is this:

enter image description here

Code for reproduction:

library(dplyr)    # %>%, group_by(), mutate()
library(ggplot2)

price_l <- rep(c('€€-€€€', '€€-€€€', '€€€€', '€€-€€€', '€€-€€€', 
             '€€-€€€', '€€€€', '€€-€€€', '€€€€', '€€-€€€', 
             '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€', 
             '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€', '€€€€','€', '€', 
             '€', '€','€€€€', '€'),100)

avg_r <- rep(c(4.5, 3.5, 4.0, 4.0, 4.0, 3.5, 4.5, 4.0, 3.0, 4.0, 
           3.0, 5.0, 4.5, 4.0, 3.0,
           3.5, 4.5, 3.5, 3.5, 4.0, 3.0, 4.0, 4.0, 2.5, 4.5),100)

sub.df <- data.frame(price_l, avg_r)

sub.df %>%    
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(sub.df, mapping=aes(price_l, avg_r), na.rm=T) +
  geom_jitter(aes(colour = price_l)) +
  geom_text(aes(label = sprintf("%.2f",mean)))

Upvotes: 2

Views: 377

Answers (3)


Reputation: 13843

For what it's worth, here's a way to do this using stat_summary(). It has two advantages over the geom_text() approach: (1) there is no need to summarize beforehand with group_by() and mutate(), and (2) it avoids the overplotting that occurs when geom_text() is given the full dataset.

The answer that uses geom_text() alone produces a result that looks right, but it overplots. The reason is that geom_text(), like every other geom, draws one "thing" on the plot for each observation in the dataset. The data frame coming out of the pipe (%>%) ahead of the ggplot() call has 2500 observations, so if you ask geom_text() to draw a label at a given position, it will do so... 2500 times.

To avoid this, you should do one of two things:

  1. Create a separate dataframe of aggregated data containing only 3 observations (three pieces of text here) and use geom_text(data = that_new_dataframe...), or

  2. Use stat_summary() and have that do all the summarizing for you based on the original dataset, sub.df.
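Option 1 is described above but not shown, so here is a minimal sketch of it, assuming the question's data. The names label_df and mean_r are made up for this illustration and do not come from the original post:

```r
library(dplyr)
library(ggplot2)

# Rebuild the question's data: 25 values repeated 100 times = 2500 rows
price_l <- rep(c('€€-€€€', '€€-€€€', '€€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€€€', '€€-€€€', '€€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€', '€€€€', '€', '€',
                 '€', '€', '€€€€', '€'), 100)
avg_r <- rep(c(4.5, 3.5, 4.0, 4.0, 4.0, 3.5, 4.5, 4.0, 3.0, 4.0,
               3.0, 5.0, 4.5, 4.0, 3.0,
               3.5, 4.5, 3.5, 3.5, 4.0, 3.0, 4.0, 4.0, 2.5, 4.5), 100)
sub.df <- data.frame(price_l, avg_r)

# Aggregate to one row per price level (label_df / mean_r are illustrative names)
label_df <- sub.df %>%
  group_by(price_l) %>%
  summarise(mean_r = mean(avg_r), .groups = "drop")

# The points use the full data; the text layer gets only the aggregated rows
p <- ggplot(sub.df, aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  geom_text(data = label_df,
            aes(y = mean_r, label = sprintf("%.2f", mean_r)),
            size = 6, fontface = 2)
p
```

Because the text layer only sees label_df, each label is drawn exactly once and check_overlap is not needed.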

For the stat_summary() method, you can write a user-defined function that returns a y value and a label (the aesthetics that geom_text() needs in addition to x), and then apply it to your dataset within stat_summary() via the fun.data argument:

# Returns one row per group: where to draw the label (y) and what it says
my_fun <- function(x){
  return(data.frame(y=mean(x), label=sprintf("%.2f", mean(x))))
}

sub.df %>%    
  ggplot(mapping=aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  stat_summary(geom="text", fun.data=my_fun, size=8)

enter image description here

Note: after posting this I realize it's similar to @TarJae's answer... but kept it here due to the further explanation.

Upvotes: 2


Reputation: 79184

We could use stat_summary() with geom = "text" and fun = mean, which computes the mean per group and draws a single label for each:

sub.df %>%    
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(sub.df, mapping=aes(price_l, avg_r), na.rm=T) +
  geom_jitter(aes(colour = price_l)) +
  stat_summary(aes(label = ..y..), geom = "text", fun = mean, color="black", size = 6, fontface = 2)

enter image description here
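A variation on the same idea, offered as a sketch rather than as part of the answer above: after_stat() (available since ggplot2 3.3.0) replaces the ..y.. notation, which newer ggplot2 releases flag as deprecated, and wrapping the computed y in sprintf() gives the two-decimal label used in the question. The group_by()/mutate() step is left out here because stat_summary() computes the mean itself.

```r
library(ggplot2)

# Rebuild the question's data: 25 values repeated 100 times = 2500 rows
price_l <- rep(c('€€-€€€', '€€-€€€', '€€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€€€', '€€-€€€', '€€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€',
                 '€€-€€€', '€€-€€€', '€€-€€€', '€€-€€€', '€€€€', '€', '€',
                 '€', '€', '€€€€', '€'), 100)
avg_r <- rep(c(4.5, 3.5, 4.0, 4.0, 4.0, 3.5, 4.5, 4.0, 3.0, 4.0,
               3.0, 5.0, 4.5, 4.0, 3.0,
               3.5, 4.5, 3.5, 3.5, 4.0, 3.0, 4.0, 4.0, 2.5, 4.5), 100)
sub.df <- data.frame(price_l, avg_r)

p <- ggplot(sub.df, aes(price_l, avg_r)) +
  geom_jitter(aes(colour = price_l)) +
  # after_stat(y) is the per-group mean computed by stat_summary();
  # sprintf() turns it into a two-decimal text label
  stat_summary(aes(label = after_stat(sprintf("%.2f", y))),
               geom = "text", fun = mean,
               colour = "black", size = 6, fontface = 2)
p
```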

Upvotes: 5

Allan Cameron

Reputation: 174348

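You can set the y value manually inside geom_text(). With check_overlap = TRUE, a label that overlaps earlier text in the same layer is not drawn, so only one label per group stays visible: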

sub.df %>%    
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(sub.df, mapping=aes(price_l, avg_r), na.rm=T) +
  geom_jitter(aes(colour = price_l)) +
  geom_text(aes(y = 3.5, label = sprintf("%.2f",mean)),
            check_overlap = TRUE, size = 6, fontface = 2)

enter image description here

Or, as r2evans suggests, place each label at its group mean:

sub.df %>%    
  group_by(price_l) %>%
  mutate(mean = mean(avg_r)) %>%
  ungroup() %>%
  ggplot(sub.df, mapping=aes(price_l, avg_r), na.rm=T) +
  geom_jitter(aes(colour = price_l)) +
  geom_text(aes(y = mean, label = sprintf("%.2f",mean)),
            check_overlap = TRUE, size = 6, fontface = 2)

enter image description here

Upvotes: 5
