Alice Hobbs

Reputation: 1215

Specialised Boxplot: Plotting Lines to the Error Bars to Highlight the Data Range in R

Overview

I have a data frame called ANOVA.Dataframe.1 (see below) containing the dependent variable 'Canopy_Index' and the independent variable 'Urbanisation_index'.

My aim is to produce a boxplot (exactly like the desired result below) of Canopy Cover (%) for each category of the Urbanisation Index, with lines drawn to both the bottom and the top of the error bars to highlight the data range.

I have searched extensively for code to produce this kind of boxplot (please see the desired result) but was unsuccessful, and I'm also unsure whether these boxplots have a specialised name.

Perhaps this can be achieved in either ggplot or base R.

If anyone can help, I would be deeply appreciative.

Desired Result (Reference)

enter image description here

I can produce an ordinary boxplot with the R-code below, but I cannot figure out how to implement the lines pointing towards the ends of the error bars.

R-code

Boxplot.obs1.Canopy.Urban<-boxplot(ANOVA.Dataframe.1$Canopy_Index~ANOVA.Dataframe.1$Urbanisation_index,
                               main="Mean Canopy Index (%) for Categories of the Urbanisation Index",
                               xlab="Urbanisation Index",
                               ylab="Canopy Index (%)")

Boxplot produced from R-code

enter image description here

Data frame 1

structure(list(Urbanisation_index = c(2, 2, 4, 4, 3, 3, 4, 4, 
4, 2, 4, 3, 4, 4, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 
2, 2, 2, 4, 4, 3, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 4, 4, 4, 
4, 4, 4, 4), Canopy_Index = c(65, 75, 55, 85, 85, 85, 95, 85, 
85, 45, 65, 75, 75, 65, 35, 75, 65, 85, 65, 95, 75, 75, 75, 65, 
75, 65, 75, 95, 95, 85, 85, 85, 75, 75, 65, 85, 75, 65, 55, 95, 
95, 95, 95, 45, 55, 35, 55, 65, 95, 95, 45, 65, 45, 55)), row.names = c(NA, 
-54L), class = "data.frame")

Data frame 2

structure(list(Urbanisation_index = c(2, 2, 4, 4, 3, 3, 4, 4, 
4, 3, 4, 4, 4, 4, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 
2, 2, 2, 4, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1, 4, 4, 4, 4, 4, 4, 4
), Canopy_Index = c(5, 45, 5, 5, 5, 5, 45, 45, 55, 15, 35, 45, 
5, 5, 5, 5, 5, 5, 35, 15, 15, 25, 25, 5, 5, 5, 5, 5, 5, 15, 25, 
15, 35, 25, 45, 5, 25, 5, 5, 5, 5, 55, 55, 15, 5, 25, 15, 15, 
15, 15)), row.names = c(NA, -50L), class = "data.frame")

Upvotes: 0

Views: 112

Answers (1)

Paweł Chabros

Reputation: 2399

Alice, is this what you are looking for?

enter image description here

You can do everything with ggplot2, but for non-standard things you have to play with it for a while. My code:

library(tidyverse)
library(wrapr)

# `df` is assumed to hold one of the question's data frames,
# e.g. df <- ANOVA.Dataframe.1
df %.>%
  ggplot(data = ., aes(
    x = Urbanisation_index,
    y = Canopy_Index,
    group = Urbanisation_index
  )) +
  stat_boxplot(
    geom = 'errorbar',
    width = .25
  ) +
  geom_boxplot() +
  geom_line(
    data = group_by(., Urbanisation_index) %>%
      summarise(
        bot = min(Canopy_Index),
        top = max(Canopy_Index)
      ) %>%
      gather(pos, val, bot:top) %>% 
      select(
        x = Urbanisation_index,
        y = val
      ) %>%
      mutate(gr = row_number()) %>%
      bind_rows(
        tibble(
          x = 0,
          y = max(.$y) * 1.15,
          gr = 1:8
        )
      ),
    aes(
      x = x,
      y = y,
      group = gr
    )) +
  theme_light() +
  theme(panel.grid = element_blank()) +
  coord_cartesian(
    xlim = c(min(.$Urbanisation_index) - .5, max(.$Urbanisation_index) + .5),
    ylim = c(min(.$Canopy_Index) * .95, max(.$Canopy_Index) * 1.05)
  ) +
  ylab('Canopy Index (%)') +
  xlab('Urbanisation Index')
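
Note that the lines above end at each group's minimum and maximum, which match the whisker ends only when a group has no outliers. Since base R was also mentioned, here is a minimal base R sketch of the same idea that takes the whisker ends from the object returned by boxplot(). The `toy` data frame is made up for illustration (it is not the question's data), and the anchor point the lines start from is an assumption chosen to mimic the plot above.

```r
# Made-up data for illustration only (not the question's data frames).
# Group 1 contains a low outlier (5) so that the whisker end and the
# minimum differ.
toy <- data.frame(
  Urbanisation_index = rep(1:3, each = 5),
  Canopy_Index = c( 5, 55, 65, 75, 85,
                   45, 65, 75, 85, 95,
                   55, 65, 75, 85, 95)
)

pdf(NULL)  # null graphics device, so the sketch also runs non-interactively

# boxplot() invisibly returns the statistics it drew. Rows 1 and 5 of
# $stats hold the lower and upper whisker ends of each group.
bp <- boxplot(Canopy_Index ~ Urbanisation_index, data = toy,
              xlab = "Urbanisation Index", ylab = "Canopy Index (%)")
whisker_low  <- bp$stats[1, ]
whisker_high <- bp$stats[5, ]

# Boxes are drawn at x = 1, 2, ..., number of groups.
positions <- seq_along(bp$names)

# Assumed common starting point for all lines: left of the first box and
# above the tallest whisker. It lies outside the plot region, so the
# lines are clipped at the plot border.
anchor_x <- 0
anchor_y <- max(whisker_high) * 1.15

# One line from the anchor to each lower and each upper whisker end.
segments(anchor_x, anchor_y, positions, whisker_low)
segments(anchor_x, anchor_y, positions, whisker_high)

invisible(dev.off())
```

To try it on the question's data, replace `toy` with ANOVA.Dataframe.1 and drop the pdf(NULL) and dev.off() calls to draw on screen. If you would rather have the lines run to the extreme values, use each group's min() and max() instead of the rows of bp$stats.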

Upvotes: 1
