KT_1

Reputation: 8494

Grouped ggplot boxplot in R

For a sample dataframe:

   df <- structure(list(year = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L), letter_group = c("A", "A", "A", "B", "B", "B", "C", 
"C", "C", "C", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", 
"A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", 
"A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"), value = c(2L, 
3L, 4L, 5L, 6L, 6L, 7L, 8L, 5L, 6L, 7L, 3L, 4L, 5L, 6L, 4L, 5L, 
6L, 2L, 3L, 4L, 4L, 5L, 6L, 7L, 8L, 5L, 3L, 2L, 4L, 5L, 6L, 4L, 
3L, 4L, 5L, 6L, 7L, 1L, 2L, 4L, 5L, 6L, 4L)), .Names = c("year", 
"letter_group", "value"), row.names = c(NA, -44L), class = c("tbl_df", 
"tbl", "data.frame"), spec = structure(list(cols = structure(list(
    year = structure(list(), class = c("collector_integer", "collector"
    )), letter_group = structure(list(), class = c("collector_character", 
    "collector")), value = structure(list(), class = c("collector_integer", 
    "collector"))), .Names = c("year", "letter_group", "value"
)), default = structure(list(), class = c("collector_guess", 
"collector"))), .Names = c("cols", "default"), class = "col_spec"))

I am trying to create a box plot with the years on the x-axis and, within each year, one box per letter group.

i.e. A, B, C for year 1, then a small space, then A, B, C for year 2, and so on.

I have the following:

library(ggplot2)

p1 <- ggplot(df, aes(year, value))
p1 + geom_boxplot(aes(group=letter_group))

But this produces only three box plots (one per letter group) instead of one per letter group within each year.

Could someone please help me?

Upvotes: 1

Views: 121

Answers (3)

Dan

Reputation: 12084

An alternative to @nouse's solution (which is the best solution) is to use faceting. One benefit of faceting is that you also get the letter group labels on the x-axis.

Define data structure

# Load library
library(ggplot2)

# Define data frame
df <- structure(list(year = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                              2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 
                              3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
                              4L, 4L), letter_group = c("A", "A", "A", "B", "B", "B", "C", 
                                                        "C", "C", "C", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", 
                                                        "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", 
                                                        "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"), 
                     value = c(2L, 3L, 4L, 5L, 6L, 6L, 7L, 8L, 5L, 6L, 7L, 3L, 4L, 5L, 6L, 4L, 5L, 
                               6L, 2L, 3L, 4L, 4L, 5L, 6L, 7L, 8L, 5L, 3L, 2L, 4L, 5L, 6L, 4L, 
                               3L, 4L, 5L, 6L, 7L, 1L, 2L, 4L, 5L, 6L, 4L)), 
                .Names = c("year", "letter_group", "value"), 
                row.names = c(NA, -44L), 
                class = c("tbl_df","tbl", "data.frame"), 
                spec = structure(list(cols = structure(list(year = structure(list(), class = c("collector_integer", "collector")), 
                                                             letter_group = structure(list(), class = c("collector_character", "collector")), 
                                                             value = structure(list(), class = c("collector_integer",  "collector"))), 
                                                       .Names = c("year", "letter_group", "value")), 
                                      default = structure(list(), class = c("collector_guess", "collector"))), 
                                 .Names = c("cols", "default"), class = "col_spec"))

Plot results

# Plot results
g <- ggplot(df)
g <- g + geom_boxplot(aes(letter_group, value))
g <- g + facet_grid(. ~ year, switch = "x")
g <- g + theme(strip.placement = "outside",
               strip.background = element_blank(),
               panel.background = element_rect(fill = "white"),
               panel.grid.major = element_line(colour = alpha("gray50", 0.25), linetype = "dashed"))
g <- g + ylab("Value") + xlab("Year & Letter Group")
print(g)

Created on 2019-05-23 by the reprex package (v0.2.1)

Upvotes: 3

nouse

Reputation: 3471

Your question has been largely answered here.

Your data frame does not contain factors, so you first need to convert your grouping variables to factors. Then there are two options, as described in the linked answer. Either construct a new factor by combining your two original factors (as shown in zerocool's answer), which does not create the desired space between the years on the x-axis, or map one of your factors to fill or colour. In your case, the quickest way to solve your problem is:

ggplot(df, aes(as.factor(year), value, fill=as.factor(letter_group))) + geom_boxplot()

If you do not want the boxes coloured, you can override the colours with scale_fill_manual or scale_color_manual, depending on which aesthetic you mapped in aes, and hide the legend:

ggplot(df, aes(as.factor(year), value, fill=as.factor(letter_group))) + geom_boxplot() +
  scale_fill_manual(values=c("white", "white", "white")) +
  theme(legend.position = "none")
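
To see why the original call drew only three boxes, it may help to count the boxes each plot builds. The following is a minimal sketch, not part of the original answer: it rebuilds the question's sample data as a plain data.frame (an assumption made for brevity; the values are copied from the question) and uses ggplot2's layer_data() to count one row per box. The names p_orig, p_fill, n_orig and n_fill are illustrative only.

```r
library(ggplot2)

# Compact rebuild of the question's sample data (44 rows), as a plain data.frame
df <- data.frame(
  year = rep(1:4, times = c(10, 10, 13, 11)),
  letter_group = c(
    rep(c("A", "B", "C"), times = c(3, 3, 4)),  # year 1
    rep(c("A", "B", "C"), times = c(3, 3, 4)),  # year 2
    rep(c("A", "B", "C"), times = c(3, 3, 7)),  # year 3
    rep(c("A", "B", "C"), times = c(3, 3, 5))   # year 4
  ),
  value = c(2, 3, 4, 5, 6, 6, 7, 8, 5, 6,
            7, 3, 4, 5, 6, 4, 5, 6, 2, 3,
            4, 4, 5, 6, 7, 8, 5, 3, 2, 4, 5, 6, 4,
            3, 4, 5, 6, 7, 1, 2, 4, 5, 6, 4)
)

# Original attempt: year stays numeric and group = letter_group replaces the
# default grouping, so each letter group is pooled across all years
p_orig <- ggplot(df, aes(year, value)) +
  geom_boxplot(aes(group = letter_group))

# Suggested fix: a discrete x plus a discrete fill lets ggplot2 group by
# every year / letter_group combination and dodge the boxes within a year
p_fill <- ggplot(df, aes(factor(year), value, fill = letter_group)) +
  geom_boxplot()

# layer_data() returns one row per box that the layer will draw
n_orig <- nrow(layer_data(p_orig))
n_fill <- nrow(layer_data(p_fill))
```

With this data, n_orig should be 3 and n_fill should be 12 (4 years times 3 letter groups), which matches the behaviour described in the question.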

Upvotes: 1

zerocool

Reputation: 369

This should work:

library(tidyverse)
df %>% 
  mutate(year_group = paste(year, letter_group)) %>% 
  ggplot(aes(year_group, value)) +
  geom_boxplot()
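
One caveat with paste() is that the combined labels are sorted as text, so a year such as 10 would be placed before year 2. A possible variation, sketched here with base R's interaction() rather than taken from the answer, builds the combined factor with an explicit year-then-letter level order. The sample data is rebuilt as a plain data.frame for brevity (an assumption; the values are copied from the question), and year_group and p are illustrative names.

```r
library(ggplot2)

# Compact rebuild of the question's sample data (44 rows), as a plain data.frame
df <- data.frame(
  year = rep(1:4, times = c(10, 10, 13, 11)),
  letter_group = c(
    rep(c("A", "B", "C"), times = c(3, 3, 4)),  # year 1
    rep(c("A", "B", "C"), times = c(3, 3, 4)),  # year 2
    rep(c("A", "B", "C"), times = c(3, 3, 7)),  # year 3
    rep(c("A", "B", "C"), times = c(3, 3, 5))   # year 4
  ),
  value = c(2, 3, 4, 5, 6, 6, 7, 8, 5, 6,
            7, 3, 4, 5, 6, 4, 5, 6, 2, 3,
            4, 4, 5, 6, 7, 8, 5, 3, 2, 4, 5, 6, 4,
            3, 4, 5, 6, 7, 1, 2, 4, 5, 6, 4)
)

# Combine year and letter_group into one factor; lex.order = TRUE makes the
# first factor (year) vary slowest, and factor() orders integer years numerically
df$year_group <- interaction(factor(df$year), factor(df$letter_group),
                             sep = " ", lex.order = TRUE)

# One box per combined level; as with paste(), there is no gap between years
p <- ggplot(df, aes(year_group, value)) +
  geom_boxplot()
```

As in the answer above, this places all boxes at equal spacing, so it does not by itself produce the visual gap between years that the question asks for.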

Upvotes: -1
