Kirthana

Reputation: 23

ggplot2 geom_boxplot(outlier.size = NA, position = position_dodge(width = 0.8)): outliers not omitted and position_dodge does not work

I am working on a data visualization in R Shiny where I aim to display the spread of avg_speed for each hour of the day using boxplots. My goal is to represent each road as a separate boxplot within the same graph, displayed side by side for each hour.

For example, if the time range is 1 AM to 2 AM and the user selects three roads, the graph should show three distinct boxplots (one for each road) side by side, all grouped under the 1 AM to 2 AM label. However, in my current implementation, the boxplots are overlapping, which affects the clarity of the visualization.

Even though I am including outlier.size = NA, the outliers are still visible.
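For reference, the ggplot2 documentation for geom_boxplot() describes hiding outliers with outlier.shape = NA rather than outlier.size. A minimal sketch of that setting on a static plot follows; the `toy` data frame and its column names are made up for illustration and are not the real data. Whether ggplotly() honors this argument after conversion is a separate matter, so this may only help for static output.

```r
library(ggplot2)

# Hypothetical toy data: 3 roads x 2 hours x 20 readings each.
set.seed(1)
toy <- expand.grid(
  reading = 1:20,
  road = c("Road A", "Road B", "Road C"),
  hour = c("14", "15"),
  stringsAsFactors = FALSE
)
toy$speed <- rnorm(nrow(toy), mean = 33, sd = 1.5)
toy$speed[c(1, 21, 41)] <- 50  # one extreme reading per road at hour 14

p_static <- ggplot(toy, aes(x = hour, y = speed, fill = road)) +
  geom_boxplot(
    outlier.shape = NA,  # hides outlier points; they still affect the y-axis range
    position = position_dodge(width = 0.8)
  )
```

Calling print(p_static) draws the static version. Because the hidden points still count toward the axis range, the y-axis may need limits set through coord_cartesian() to tighten the view.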

The code is as follows:

output$boxplot <- renderPlotly({
  p <- ggplot(aggregated_data, aes(x = time_by_hour, y = avg_speed, fill = RoadName.x)) + 
    geom_boxplot(outlier.size = NA, position = position_dodge(width = 0.8)) +
    labs(
      title = "Distribution of average speed (in mph) by each hour (Hover over the boxplots to know more info)",
      subtitle = "Hover over the boxplots to know more info",
      x = "Time by Hour",
      y = "Average Speed (mph)",
      fill = "Road") +
    theme_minimal(base_size = 15) +  
    theme(
      panel.background = element_rect(fill = "white"),        
      panel.grid.major = element_line(color = "#f2f0ef"),   
      panel.grid.minor = element_line(color = "#f2f0ef"),
      plot.title = element_text(size = 12),
      plot.subtitle = element_text(size = font_size_for_boxplot),          
      axis.title.x = element_text(size = font_size_for_boxplot),  
      axis.title.y = element_text(size = font_size_for_boxplot),          
      axis.text.x = element_text(size = font_size_for_boxplot) ,     
      axis.text.y = element_text(size = font_size_for_boxplot) ,
      legend.position = "right",                                     
      legend.text = element_text(size = 8),                          
      legend.title = element_text(size = 8)
    ) 
  # Add hover information
  ggplotly(p, tooltip = c("time_by_hour", "avg_speed", "RoadName.x"))
})

Here are the specific requirements:

  1. The x-axis should display the time range (hour of the day), and boxplots should be grouped under each hour.
  2. For each hour, multiple boxplots (one for each selected road) should appear side by side.
  3. Facets are not preferred as I want all roads to appear in the same graph.
  4. Outliers should not be visualized to maintain a cleaner appearance.
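The four requirements above could be approached roughly as follows. This is a sketch under assumptions, not a confirmed fix: it assumes the overlap and the visible outliers come from the ggplotly() conversion rather than from ggplot2 itself, and it uses made-up data (`demo`, with three hypothetical roads), because the dput() sample in this post contains only one road. Side-by-side grouping is requested through plotly's boxmode = "group" layout attribute, and the outlier markers are made transparent by editing the box traces directly.

```r
library(ggplot2)
library(plotly)

# Hypothetical data: 3 roads x 2 hours x 20 readings each.
set.seed(42)
demo <- expand.grid(
  reading = 1:20,
  RoadName.x = c("Road A", "Road B", "Road C"),
  time_by_hour = c("14", "15"),
  stringsAsFactors = FALSE
)
demo$avg_speed <- rnorm(nrow(demo), mean = 33, sd = 1.5)

p <- ggplot(demo, aes(x = time_by_hour, y = avg_speed, fill = RoadName.x)) +
  geom_boxplot() +
  labs(x = "Time by Hour", y = "Average Speed (mph)", fill = "Road")

fig <- ggplotly(p)

# Make the outlier markers of every box trace fully transparent.
fig$x$data <- lapply(fig$x$data, function(trace) {
  if (identical(trace$type, "box")) {
    trace$marker$opacity <- 0
  }
  trace
})

# Ask plotly to place boxes that share an x value side by side.
fig <- plotly::layout(fig, boxmode = "group")
```

Inside renderPlotly(), `fig` would be the value returned in place of the bare ggplotly(p, ...) call. No facets are involved, so all roads stay in one panel.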

I would appreciate any guidance or suggestions on how to modify the code to achieve this layout effectively.

Output of dput(aggregated_data):

structure(list(
    RoadName.x = c("North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road", "North Craycroft Road", "North Craycroft Road", 
    "North Craycroft Road"), avg_speed = c(33.6666666666667, 
    34, 34, 34, 34, 33.3333333333333, 34.3333333333333, 34, 34, 
    34, 34, 34, 34.3333333333333, 34.3333333333333, 34.3333333333333, 
    35.3333333333333, 34.6666666666667, 34, 33, 33.6666666666667, 
    31.6666666666667, 31.3333333333333, 31, 31, 31, 32.6666666666667, 
    33.6666666666667, 34.3333333333333, 35, 35, 35.6666666666667, 
    35.6666666666667, 34, 34.3333333333333, 34.6666666666667, 
    35.3333333333333, 35, 35, 34.3333333333333, 33.3333333333333, 
    32.6666666666667, 31.3333333333333, 31.3333333333333, 31.3333333333333, 
    31.6666666666667), time_by_hour = c("14", "14", "14", "14", "14", 
    "14", "14", "14", "14", "14", "14", "14", "14", "14", "14", 
    "14", "14", "14", "14", "14", "14", "14", "14", "14", "14", 
    "14", "14", "14", "14", "14", "15", "15", "15", "15", "15", 
    "15", "15", "15", "15", "15", "15", "15", "15", "15", "15"
    )), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -45L))

The current output is attached as a screenshot.

Upvotes: 0

Views: 45

Answers (0)
