Komal Rathi
Komal Rathi

Reputation: 4274

Plotly: Annotate outliers with sample names in boxplot

I am trying to create a boxplot with ggplot and plotly with the dataset airquality where Month is on the x-axis and Ozone values are on y-axis. My aim is to annotate the plot so that when I hover over the outlier points it should show the Sample name in addition to the Ozone value:

library(tidyverse)
library(plotly)
library(datasets)
data(airquality)

# add months
airquality$Month <- factor(airquality$Month,
                           labels = c("May", "Jun", "Jul", "Aug", "Sep"))

# add sample names
airquality$Sample <- paste0('Sample_',seq(1:nrow(airquality)))

# boxplot
p <- ggplot(airquality, aes(x = Month, y = Ozone)) +
  geom_boxplot()
p <- plotly_build(p)
p

Here is the plot that's created:

enter image description here

By default, when I hover over each of the boxes, it shows the basic summary stats of the x-axis variable. However, what I would also like to see is what the outlier samples are. For e.g. when hovering over May, it shows the outlier value 115 but it does not show that it is actually Sample_30.

How can I add the Sample variable to the outlier points so it shows both the outlier value as well as the sample name?

Upvotes: 4

Views: 3759

Answers (4)

Isaac Zhao
Isaac Zhao

Reputation: 429

This method will achieve the same result but does not show the boxplot summary statistics hover. Removes outlier and hover on boxplot layer and overlays a geom_point layer of only outliers with hover info. The definition of outliers for plotly are stated here. This method would work better than other solutions when dealing with more complex graphs (e.g. grouped side by side boxplots). Interestingly, the ggplotly boxplot graph for this data is not the same as the ggplot graph. The upper fence whisker for Aug in ggplotly extends much further than the ggplot upper fence whisker for Aug.

library(dplyr)
library(plotly)
library(datasets)
library(ggplot2)
data(airquality)

# manipulate data
mydata = airquality %>% 
    # add months
    mutate(Month = factor(airquality$Month,labels = c("May", "Jun", "Jul", "Aug", "Sep")),
    # add sample names
           Sample = paste0('Sample_',seq(1:n())))%>%
    # label if outlier sample by Month
    group_by(Month) %>% 
    mutate(OutlierFlag = ifelse((Ozone<quantile(Ozone,1/3,na.rm=T)-1.5*IQR(Ozone,na.rm=T)) | (Ozone>quantile(Ozone,2/3,na.rm=T)+1.5*IQR(Ozone,na.rm=T)),'Outlier','NotOutlier'))%>%
    group_by()


# boxplot
p <- ggplot(mydata, aes(x = Month, y = Ozone)) +
    geom_boxplot()+
    geom_point(data=mydata %>% filter(OutlierFlag=="Outlier"),aes(group=Month,label1=Sample,label2=Ozone),size=2)

output = ggplotly(p, tooltip=c("label1","label2"))

# makes boxplot outliers invisible and hover info off
for (i in 1:length(output$x$data)){
    if (output$x$data[[i]]$type=="box"){
        output$x$data[[i]]$marker$opacity = 0  
        output$x$data[[i]]$hoverinfo = "none"
    }
}

# print end result of plotly graph
output

boxplot with outlier ID hover

Upvotes: 2

St&#233;phane Laurent
St&#233;phane Laurent

Reputation: 84529

I've managed to achieve this with Shiny.

library(plotly)
library(shiny)
library(htmlwidgets)
library(datasets)

# Prepare data ----
data(airquality)
# add months
airquality$Month <- factor(airquality$Month,
                           labels = c("May", "Jun", "Jul", "Aug", "Sep"))
# add sample names
airquality$Sample <- paste0('Sample_', seq(1:nrow(airquality)))

# Plotly on hover event ----
addHoverBehavior <- c(
  "function(el, x){",
  "  el.on('plotly_hover', function(data) {",
  "    if(data.points.length==1){",
  "      $('.hovertext').hide();",
  "      Shiny.setInputValue('hovering', true);",
  "      var d = data.points[0];",
  "      Shiny.setInputValue('left_px', d.xaxis.d2p(d.x) + d.xaxis._offset);",
  "      Shiny.setInputValue('top_px', d.yaxis.l2p(d.y) + d.yaxis._offset);",
  "      Shiny.setInputValue('dx', d.x);",
  "      Shiny.setInputValue('dy', d.y);",
  "      Shiny.setInputValue('dtext', d.text);",
  "    }",
  "  });",
  "  el.on('plotly_unhover', function(data) {",
  "    Shiny.setInputValue('hovering', false);",
  "  });",
  "}")

# Shiny app ----
ui <- fluidPage(
  tags$head(
    # style for the tooltip with an arrow (http://www.cssarrowplease.com/)
    tags$style("
               .arrow_box {
                    position: absolute;
                  pointer-events: none;
                  z-index: 100;
                  white-space: nowrap;
                  background: rgb(54,57,64);
                  color: white;
                  font-size: 14px;
                  border: 1px solid;
                  border-color: rgb(54,57,64);
                  border-radius: 1px;
               }
               .arrow_box:after, .arrow_box:before {
                  right: 100%;
                  top: 50%;
                  border: solid transparent;
                  content: ' ';
                  height: 0;
                  width: 0;
                  position: absolute;
                  pointer-events: none;
               }
               .arrow_box:after {
                  border-color: rgba(136, 183, 213, 0);
                  border-right-color: rgb(54,57,64);
                  border-width: 4px;
                  margin-top: -4px;
               }
               .arrow_box:before {
                  border-color: rgba(194, 225, 245, 0);
                  border-right-color: rgb(54,57,64);
                  border-width: 10px;
                  margin-top: -10px;
               }")
  ),
  div(
    style = "position:relative",
    plotlyOutput("myplot"),
    uiOutput("hover_info")
  )
)

server <- function(input, output){
  output$myplot <- renderPlotly({
    airquality[[".id"]] <- seq_len(nrow(airquality))
    gg <- ggplot(airquality, aes(x=Month, y=Ozone, ids=.id)) + geom_boxplot()
    ggly <- ggplotly(gg, tooltip = "y")
    ids <- ggly$x$data[[1]]$ids
    ggly$x$data[[1]]$text <- 
      with(airquality, paste0("<b> sample: </b>", Sample, "<br/>",
                              "<b> month: </b>", Month, "<br/>",
                              "<b> ozone: </b>", Ozone))[ids]
    ggly %>% onRender(addHoverBehavior)
  })
  output$hover_info <- renderUI({
    if(isTRUE(input[["hovering"]])){
      style <- paste0("left: ", input[["left_px"]] + 4 + 5, "px;", # 4 = border-width after
                      "top: ", input[["top_px"]] - 24 - 2 - 1, "px;") # 24 = line-height/2 * number of lines; 2 = padding; 1 = border thickness
      div(
        class = "arrow_box", style = style,
        p(HTML(input$dtext), 
          style="margin: 0; padding: 2px; line-height: 16px;")
      )
    }
  })
}

shinyApp(ui = ui, server = server)

enter image description here

Upvotes: 2

St&#233;phane Laurent
St&#233;phane Laurent

Reputation: 84529

We can almost get it like this:

library(ggplot2)
library(plotly)
library(datasets)
data(airquality)
# add months
airquality$Month <- factor(airquality$Month,
                           labels = c("May", "Jun", "Jul", "Aug", "Sep"))
# add sample names
airquality$Sample <- paste0('Sample_',seq(1:nrow(airquality)))
# boxplot
gg <- ggplot(airquality, aes(x = Month, y = Ozone)) +
  geom_boxplot()
ggly <- ggplotly(gg)
# add hover info
hoverinfo <- with(airquality, paste0("sample: ", Sample, "</br></br>", 
                                     "month: ", Month, "</br>",
                                     "ozone: ", Ozone))
ggly$x$data[[1]]$text <- hoverinfo
ggly$x$data[[1]]$hoverinfo <- c("text", "boxes")

ggly

enter image description here

Unfortunately, the hovering does not work for the first box plot...

Upvotes: 6

Alex Yahiaoui Martinez
Alex Yahiaoui Martinez

Reputation: 994

I found solution on https://github.com/ropensci/plotly/issues/887

Try to make this kind of code !

 library(plotly)

 vals <- boxplot(airquality$Ozone,plot = FALSE)
 y <- airquality[airquality$Ozone > vals$stats[5,1] | airquality$Ozone < vals$stats[1,1],]

plot_ly(airquality,y = ~Ozone,x = ~Month,type = "box") %>% 
   add_markers(data = y, text = y$Day)

Upvotes: 0

Related Questions