Christopher Robinson

Reputation: 33

gghighlight for labelling specific points in scatterplot

I have a data frame that looks like this:

 rowname  Class Sec    ES.2um Mean_WPBs   ES.2um_ZS   Mean_ES  VWF_Sec    name
       1 Formin HAI 113.37340  147.1792  0.16078492 131.69309 162.5219  DIAPH1
       2 Formin HAI  43.90661  121.9017 -0.11594028  75.37296 137.4212    FMN2
       3 Septin HAI  64.32138  132.7591 -0.16218581  66.23765 195.9011 SEPTIN5
       4 Septin HAI  53.15791  145.7871 -0.86969449  81.92690 187.2647   LRCH3
       5 Arp2/3 HAI  68.67222  161.0516 -0.05404113  82.51804 158.2623   ARPC3
       6 Arp2/3 HAI  71.00643  149.0704 -0.38119473  82.91458 220.5494   WASF3

and am currently using gghighlight to highlight a class of proteins; see the code below:

plot_ESZ_lab <-ggplot(df, aes(ES.2um_ZS, VWF_Sec, color = Sec, shape = Sec)) + 
               geom_point(aes(size = Mean_ES)) + 
               scale_size_continuous(range=c(0.5,10))+ 
               scale_color_manual(values=c("HAI" = "blue", "PMA" = "red")) + 
               gghighlight(Class == "Formin", use_direct_label = TRUE, 
                           label_key = name, label_params = list(size=2)) + 
               xlab("Mean Exit Site Z-Score") + ylab("Secretion") + 
               ggtitle("Formin Highlighted") + 
               theme_bw() + theme(plot.title = element_text(hjust =0.5))

I would also like to highlight just 2 or 3 proteins using their names; this is what I have tried:

plot_ESZ_lab <-ggplot(df, aes(ES.2um_ZS, VWF_Sec, color = Sec, shape = Sec)) + 
               geom_point(aes(size = Mean_ES)) + 
               scale_size_continuous(range=c(0.5,10))+ 
               scale_color_manual(values=c("HAI" = "blue", "PMA" = "red")) + 
               gghighlight(Class == "Formin", name == "FMN2", "DIAPH1", 
                           use_direct_label = TRUE, label_key = name,
                           label_params = list(size=2)) + 
               xlab("Mean Exit Site Z-Score") + ylab("Secretion") +
               ggtitle("Formin Highlighted") + 
               theme_bw() + theme(plot.title = element_text(hjust =0.5))

but only the first name provided to gghighlight (i.e. FMN2) is ever plotted. How can I get more than one point plotted, in this case both FMN2 and DIAPH1?

Upvotes: 1

Views: 1092

Answers (1)

M--

Reputation: 28850

In ggplot2, and in almost every function in R, a comma (,) separates different arguments. You cannot use it to pass multiple values to the same variable. Instead, write name %in% c("FMN2", "DIAPH1"), which means name equals FMN2 or DIAPH1. The code below works:

ggplot(df, aes(ES.2um_ZS, VWF_Sec, color = Sec, shape = Sec)) + 
      geom_point(aes(size = Mean_ES)) + 
      scale_size_continuous(range=c(0.5,10))+ 
      scale_color_manual(values=c("HAI" = "blue", "PMA" = "red")) + 
      gghighlight(Class == "Formin", name %in% c("FMN2", "DIAPH1"), 
                  use_direct_label = TRUE, label_key = name,
                  label_params = list(size=2)) + 
      xlab("Mean Exit Site Z-Score") + ylab("Secretion") +
      ggtitle("Formin Highlighted") +  
      theme_bw() + theme(plot.title = element_text(hjust =0.5))

            https://i.sstatic.net/XcjQZ.png
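To see what %in% does without drawing a plot, here is a minimal base R sketch. The vector name below is a hand-typed stand-in for the name column of df, and wanted, hits_in and hits_or are illustrative variable names, not part of gghighlight:

```r
# Stand-in for df$name (typed by hand for illustration)
name <- c("DIAPH1", "FMN2", "SEPTIN5", "LRCH3", "ARPC3", "WASF3")
wanted <- c("FMN2", "DIAPH1")

# %in% asks, for each element of `name`, whether it appears in `wanted`
hits_in <- name %in% wanted

# The same test written out as an explicit OR of two comparisons
hits_or <- name == "FMN2" | name == "DIAPH1"

print(name[hits_in])                # the names that would be highlighted
print(identical(hits_in, hits_or))  # both forms give the same logical vector
```

The result is one logical value per row, which is the shape of condition a filtering expression needs.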

Data:

    df <- structure(list(rowname = 1:6, Class = structure(c(2L, 2L, 3L, 
    3L, 1L, 1L), .Label = c("Arp2/3", "Formin", "Septin"), class = "factor"), 
    Sec = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "HAI", class = "factor"), 
    ES.2um = c(113.3734, 43.90661, 64.32138, 53.15791, 68.67222, 
    71.00643), Mean_WPBs = c(147.1792, 121.9017, 132.7591, 145.7871, 
    161.0516, 149.0704), ES.2um_ZS = c(0.16078492, -0.11594028, 
    -0.16218581, -0.86969449, -0.05404113, -0.38119473), Mean_ES = c(131.69309, 
    75.37296, 66.23765, 81.9269, 82.51804, 82.91458), VWF_Sec = c(162.5219, 
    137.4212, 195.9011, 187.2647, 158.2623, 220.5494), name = structure(c(2L, 
    3L, 5L, 4L, 1L, 6L), .Label = c("ARPC3", "DIAPH1", "FMN2", 
    "LRCH3", "SEPTIN5", "WASF3"), class = "factor")), class = "data.frame",
     row.names = c("1", "2", "3", "4", "5", "6"))
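A note on combining conditions: as far as I know, gghighlight requires every comma-separated predicate to hold (logical AND), so Class == "Formin", name %in% c(...) only highlights Formin rows that also match a name. The base R sketch below shows the difference between AND and OR on a small stand-in data frame; d, both and either are illustrative names, and WASF3 is picked only because it is not a Formin:

```r
# Small stand-in for df with only the two columns used in the conditions
d <- data.frame(
  Class = c("Formin", "Formin", "Septin", "Septin", "Arp2/3", "Arp2/3"),
  name  = c("DIAPH1", "FMN2", "SEPTIN5", "LRCH3", "ARPC3", "WASF3"),
  stringsAsFactors = FALSE
)

# AND: a row must be a Formin and also be one of the listed names
# (assumed to mirror two comma-separated predicates in gghighlight)
both <- d$Class == "Formin" & d$name %in% c("FMN2", "WASF3")

# OR: a row may be a Formin or be WASF3; written as a single condition with |
either <- d$Class == "Formin" | d$name %in% c("WASF3")

print(d$name[both])
print(d$name[either])
```

If the goal is to highlight proteins by name regardless of class, dropping the Class condition or joining the two conditions with | in a single predicate should do it.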

Upvotes: 1
