Reputation: 21
When using the simple R boxplot function, I can pass my data frame directly as the argument and a perfect boxplot emerges, e.g.:
baseline <- c(0,0,0,0,1)
post_cap <- c(1,5,5,6,11)
qx314 <- c(0,0,0,3,7)
naive_capqx <- data.frame(baseline, post_cap, qx314)
boxplot(naive_capqx)
[Image: the boxplot made with the simple R boxplot function]
However, I need to make this boxplot slightly more aesthetic, so I need to use ggplot. When I pass the data frame itself, the boxplot cannot be drawn because I need to specify x, y and fill aesthetics, and my data frame has no columns for them. My y values are the values in each vector of the data frame, and my x values are just the names of the vectors. How can I do this using ggplot? Is there a way to reshape my data frame so that it has these columns, or is there a way for ggplot to read my data as it is?
Upvotes: 2
Views: 1400
Reputation: 39
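Turn the df into a long-format df. Below, I use gather() to lengthen the df; I use group_by() to group by key (formerly the column name). The grouping is not strictly needed for the plot, since geom_boxplot() already computes one box per discrete x value.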
pacman::p_load(ggplot2, tidyverse)
baseline <- c(0, 0, 0, 0, 1)
post_cap <- c(1, 5, 5, 6, 11)
qx314 <- c(0, 0, 0, 3, 7)
# Reshape to long format: one row per (key, value) pair
naive_capqx <- data.frame(baseline, post_cap, qx314) %>%
  gather("key", "value") %>%
  group_by(key)
# One box is drawn per distinct key on the x axis
ggplot(naive_capqx, mapping = aes(x = key, y = value)) +
  geom_boxplot()
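To see what the reshaping step does, here is a minimal sketch that builds the long data frame on its own and inspects it. It assumes the tidyr package is installed; the names wide and long are only illustrative and do not come from the answer above.

library(tidyr)

baseline <- c(0, 0, 0, 0, 1)
post_cap <- c(1, 5, 5, 6, 11)
qx314 <- c(0, 0, 0, 3, 7)
wide <- data.frame(baseline, post_cap, qx314)  # 5 rows, 3 columns

# gather() stacks all columns: the old column name goes into "key",
# the cell contents go into "value"
long <- gather(wide, "key", "value")

dim(long)        # 15 rows, 2 columns
table(long$key)  # 5 observations per original column

Each original column becomes a block of rows sharing the same key, which is the shape that aes(x = key, y = value) expects.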
Upvotes: 1
Reputation: 12586
geom_boxplot expects tidy data. Your data isn't tidy because the column names contain information. So the first thing to do is to tidy your data by using pivot_longer ...
library(tidyverse)
naive_capqx %>%
  pivot_longer(everything(), values_to = "Value", names_to = "Variable") %>%
  ggplot() +
  geom_boxplot(aes(x = Variable, y = Value))
giving
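For a version that runs on its own, the same approach can be combined with the data from the question. This is only a sketch: the fill mapping, theme_minimal() and the hidden legend are assumptions about what "more aesthetic" might mean, and the names long and p are illustrative.

library(ggplot2)
library(tidyr)

baseline <- c(0, 0, 0, 0, 1)
post_cap <- c(1, 5, 5, 6, 11)
qx314 <- c(0, 0, 0, 3, 7)
naive_capqx <- data.frame(baseline, post_cap, qx314)

# Column names go into Variable, cell values into Value
long <- pivot_longer(naive_capqx, everything(),
                     names_to = "Variable", values_to = "Value")

# Mapping fill to Variable colours each box by its original column
p <- ggplot(long, aes(x = Variable, y = Value, fill = Variable)) +
  geom_boxplot() +
  theme_minimal() +
  theme(legend.position = "none")  # the x axis already names each box

print(p)

Because fill is mapped to the same column as x, the legend would only repeat the axis labels, which is why it is hidden here.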
Upvotes: 2