Nurit Eliana

Reputation: 21

Making a ggplot boxplot where each column is its own boxplot

When using the base R boxplot function, I can pass my data frame directly as the argument and a perfect boxplot emerges, e.g.:

baseline <- c(0,0,0,0,1)
post_cap <- c(1,5,5,6,11)
qx314 <- c(0,0,0,3,7)
naive_capqx <- data.frame(baseline, post_cap, qx314)
boxplot(naive_capqx)

(Image: the boxplot made with the base R boxplot function.)

However, I need to make this boxplot look better, so I need to use ggplot. When I pass the data frame itself, the boxplot cannot be drawn because I need to specify x, y and fill aesthetics, which I don't have as columns. My y values are the values in each column of the data frame and my x values are just the column names. How can I do this using ggplot? Is there a way to reshape my data frame so it has those columns, or a way for ggplot to read my data as it is?

Upvotes: 2

Views: 1400

Answers (2)

Rebecca Amodeo

Reputation: 39

Turn the df into a long format df. Below, I use gather() to lengthen the df; I use group_by() to ensure boxplot calculation by key (formerly column name).

pacman::p_load(ggplot2, tidyverse)

baseline <- c(0,0,0,0,1)
post_cap <- c(1,5,5,6,11)
qx314 <- c(0,0,0,3,7)

# Reshape to long format: column names go to "key", cell values to "value"
naive_capqx <- data.frame(baseline, post_cap, qx314) %>%
  gather("key", "value") %>%
  group_by(key)

ggplot(naive_capqx, mapping = aes(x = key, y = value)) +
  geom_boxplot()
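
For reference, the long format that the reshaping step produces can also be sketched in base R with stack(). This is a swapped-in alternative, not what the answer above uses. The column names values and ind are stack()'s defaults, and the object name long_df is only illustrative.

```r
# Same example data as in the question
baseline <- c(0, 0, 0, 0, 1)
post_cap <- c(1, 5, 5, 6, 11)
qx314 <- c(0, 0, 0, 3, 7)
naive_capqx <- data.frame(baseline, post_cap, qx314)

# stack() puts all values in one column ("values") and the
# originating column name in a factor column ("ind")
long_df <- stack(naive_capqx)

str(long_df)   # 15 rows, 2 columns: one row per original cell
head(long_df)
```

Each row is now a single observation plus a label for the column it came from, which is the shape that aes(x = ..., y = ...) needs.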

Upvotes: 1

Limey

Reputation: 12586

geom_boxplot expects tidy data. Your data isn't tidy because the column names contain information. So the first thing to do is to tidy your data by using pivot_longer...

library(tidyverse)

naive_capqx %>%  
  pivot_longer(everything(), values_to="Value", names_to="Variable") %>% 
  ggplot() +
  geom_boxplot(aes(x=Variable, y=Value))

giving a plot with one box per variable (image not shown).
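
Since the question also mentions fill, here is a minimal sketch that maps fill to the same Variable column so each box gets its own color. The object names long_df and p, the theme, and the choice to hide the legend are assumptions for illustration, not part of the answer above.

```r
library(ggplot2)
library(tidyr)

naive_capqx <- data.frame(
  baseline = c(0, 0, 0, 0, 1),
  post_cap = c(1, 5, 5, 6, 11),
  qx314    = c(0, 0, 0, 3, 7)
)

# Reshape: one row per value, with the old column name in "Variable"
long_df <- pivot_longer(
  naive_capqx,
  cols = everything(),
  names_to = "Variable",
  values_to = "Value"
)

# Map both x and fill to Variable; the legend would repeat the x axis, so hide it
p <- ggplot(long_df, aes(x = Variable, y = Value, fill = Variable)) +
  geom_boxplot(show.legend = FALSE) +
  theme_minimal()

print(p)
```

Putting the aesthetics in ggplot() rather than in geom_boxplot() means any layer added later (for example geom_jitter()) would inherit the same mapping.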

Upvotes: 2
