Reputation: 15
I have a problem with spacing data points in a boxplot. I use the following code.
DF1 <- data.frame(x = c(1, 2, 3, 4, 7, 11, 20, 23, 24, 25, 30), y = c(3, 6, 12, 13, 17, 22, NA, NA, NA, NA, NA))
library(ggplot2)
library(tidyverse)
n <- 11
DF1 <- as.data.frame(DF1)
DF1 <- reshape2::melt(DF1)
DF1 %>%
  group_by(variable) %>%
  arrange(value) %>%
  mutate(xcoord = seq(-0.25, 0.25, length.out = n())) %>%
  ggplot(aes(x = variable, y = value, group = variable)) +
  geom_boxplot() +
  geom_point(aes(x = xcoord + as.integer(variable)))
This results in the following:
For x, all data points are evenly distributed left to right, but since y has fewer data points, they are not evenly distributed left to right. How can the above code be modified to evenly space out data points for y too? I would appreciate any suggestions.
I found a somewhat similar post here, but it did not help me.
Thank you.
Upvotes: 0
Views: 212
Reputation: 145765
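The problem is the NA values in y. n() counts those rows too, so y gets 11 x positions, but only the 6 non-missing points are drawn, and because arrange() sorts NA last they all land in the left half. After you go to long format, you can simply omit them: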
plot_data = DF1 %>%
  na.omit %>% ## add this here
  group_by(variable) %>%
  arrange(value) %>%
  mutate(xcoord = seq(-0.25, 0.25, length.out = n()))
ggplot(plot_data, aes(x = variable, y = value, group = variable)) +
  geom_boxplot() +
  geom_point(aes(x = xcoord + as.integer(variable)))
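If it helps, here is a self-contained variant of the same idea that starts from the original wide data. It is only a sketch: it swaps reshape2::melt for tidyr::pivot_longer with values_drop_na = TRUE, which is my substitution and not something the question uses, and the names long and check are just illustrative. The summary at the end is there to confirm that y now has 6 rows whose offsets span the full -0.25 to 0.25 range.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

DF1 <- data.frame(
  x = c(1, 2, 3, 4, 7, 11, 20, 23, 24, 25, 30),
  y = c(3, 6, 12, 13, 17, 22, NA, NA, NA, NA, NA)
)

# Reshape to long format and drop the NA rows in the same step
# (assumption: pivot_longer replaces reshape2::melt from the question)
long <- pivot_longer(
  DF1,
  cols = everything(),
  names_to = "variable",
  values_to = "value",
  values_drop_na = TRUE
)

plot_data <- long %>%
  mutate(variable = factor(variable)) %>%   # factor so as.integer() gives 1, 2
  group_by(variable) %>%
  arrange(value, .by_group = TRUE) %>%
  # n() now counts only the non-missing rows in each group
  mutate(xcoord = seq(-0.25, 0.25, length.out = n())) %>%
  ungroup()

# Per-group check: number of points and the range of offsets
check <- plot_data %>%
  group_by(variable) %>%
  summarise(n = n(), left = min(xcoord), right = max(xcoord))
print(check)

p <- ggplot(plot_data, aes(x = variable, y = value, group = variable)) +
  geom_boxplot() +
  geom_point(aes(x = xcoord + as.integer(variable)))
```

Printing p draws the plot. Filtering with filter(!is.na(value)) before group_by() should have the same effect as na.omit here, since value is the only column that contains missing values.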
Upvotes: 2