TheAvenger

Reputation: 458

R ggplot facet_grid multi boxplot

Using ggplot and facet_grid, I'd like to visualize two parallel vectors of values as box plots. My data:

DF <- data.frame("value" =  runif(50, 0, 1),
             "value2" = runif(50,0,1),
             "type1" = c(rep("AAAAAAAAAAAAAAAAAAAAAA", 25), 
                         rep("BBBBBBBBBBBBBBBBB", 25)),
             "type2" = rep(c("c", "d"), 25), 
             "number" = rep(2:6, 10))

My current code shows only one vector of values:

ggplot(DF, aes(y=value, x=type1)) + 
  geom_boxplot(alpha=.3, aes(fill = type1)) + 
  ggtitle("TITLE") + 
  facet_grid(type2 ~ number) +
  scale_x_discrete(name = NULL, breaks = NULL) + # these lines are optional
  theme(legend.position = "bottom")

This is my plot at the moment.


I'd like to show parallel box plots, one for each vector (value and value2 in the data frame). That is, in place of each colored box plot, I'd like two box plots: one for value and one for value2.

Upvotes: 0

Views: 5244

Answers (2)

camille

Reputation: 16881

I think there's likely a post that already addresses this, in addition to the one I linked to above. The problem comes down to two things: 1) getting the data into the format that ggplot expects, i.e. long-shaped, so that there are values to map onto aesthetics, and 2) separation of concerns: use reshape2 or (more up-to-date) tidyr functions to get the data into the proper shape, and ggplot2 functions to plot it.

You can use tidyr::gather for getting long data, and conveniently pipe it directly into ggplot.

library(tidyverse)
...

To illustrate, though with very generic column names:

DF %>%
  gather(key, value = val, value, value2) %>%
  head()
#>                    type1 type2 number   key       val
#> 1 AAAAAAAAAAAAAAAAAAAAAA     c      2 value 0.5075600
#> 2 AAAAAAAAAAAAAAAAAAAAAA     d      3 value 0.6472347
#> 3 AAAAAAAAAAAAAAAAAAAAAA     c      4 value 0.7543778
#> 4 AAAAAAAAAAAAAAAAAAAAAA     d      5 value 0.7215786
#> 5 AAAAAAAAAAAAAAAAAAAAAA     c      6 value 0.1529630
#> 6 AAAAAAAAAAAAAAAAAAAAAA     d      2 value 0.8779413

Pipe that directly into ggplot:

DF %>%
  gather(key, value = val, value, value2) %>%
  ggplot(aes(x = key, y = val, fill = type1)) +
    geom_boxplot() +
    facet_grid(type2 ~ number) +
    theme(legend.position = "bottom")

Again, because of some of the generic column names, I'm not entirely sure this is the setup you want; for example, I don't know how value / value2 relate to AAAAAAA / BBBBBBB. You might need to swap the aes assignments around accordingly.
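As a side note, tidyr::gather has since been superseded by tidyr::pivot_longer. Below is a minimal sketch of the same reshape with pivot_longer, assuming tidyr 1.0.0 or later. It also shows one possible way to swap the aesthetics so that each type1 group gets a pair of boxes, one per value column. The names DF_long, p_by_key and p_by_type are made up for illustration, and set.seed is only there to make the random data reproducible.

```r
library(tidyr)
library(ggplot2)

set.seed(1)  # assumption: only for reproducible example data
DF <- data.frame(value  = runif(50, 0, 1),
                 value2 = runif(50, 0, 1),
                 type1  = c(rep("AAAAAAAAAAAAAAAAAAAAAA", 25),
                            rep("BBBBBBBBBBBBBBBBB", 25)),
                 type2  = rep(c("c", "d"), 25),
                 number = rep(2:6, 10))

# Long format: one row per (original row, value column) pair
DF_long <- pivot_longer(DF,
                        cols      = c(value, value2),
                        names_to  = "key",
                        values_to = "val")

# Same mapping as the gather version: x = key, fill = type1
p_by_key <- ggplot(DF_long, aes(x = key, y = val, fill = type1)) +
  geom_boxplot() +
  facet_grid(type2 ~ number) +
  theme(legend.position = "bottom")

# Swapped mapping: x = type1, with value / value2 dodged side by side
p_by_type <- ggplot(DF_long, aes(x = type1, y = val, fill = key)) +
  geom_boxplot() +
  facet_grid(type2 ~ number) +
  scale_x_discrete(name = NULL, breaks = NULL) +
  theme(legend.position = "bottom")
```

Printing p_by_key or p_by_type draws the corresponding plot. Which mapping is the right one depends on what value and value2 actually represent.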

Upvotes: 1

Freakazoid

Reputation: 520

You have to reshape your data frame. Use an additional indicator column that defines the type of value (for example "value_type") and keep only one value column. The indicator then matches each value to its corresponding value type. The following code builds your example data in that shape:

DF <- data.frame("value" =  c(runif(50, 0, 1), runif(50,0,1)),
                 "value_type" = rep(c("value1","value2"), each=50),
                 "type1" = rep(c(rep("AAAAAAAAAAAAAAAAAAAAAA", 25), 
                                 rep("BBBBBBBBBBBBBBBBB", 25)), 2),
                 "type2" = rep(rep(c("c", "d"), 25), 2), 
                 "number" = rep(rep(2:6, 10),2))

Then call ggplot with an additional color aesthetic:

library(ggplot2)

ggplot(DF, aes(y=value, x=type1, col=value_type)) + 
  geom_boxplot(alpha=.3, aes(fill = type1)) + 
  ggtitle("TITLE") + 
  facet_grid(type2 ~ number) +
  scale_color_manual(values=c("green", "steelblue")) + # set the outline colors of the two value types manually
  scale_x_discrete(name = NULL, breaks = NULL) + # these lines are optional
  theme(legend.position = "bottom")
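Note that the data frame above draws fresh random numbers instead of reusing the original ones. If you want to keep the values you already have, a base R sketch along the following lines should produce the same layout from the existing columns. The name DF_long is made up for illustration, and set.seed is only there to make the example reproducible.

```r
set.seed(1)  # assumption: only for reproducible example data
DF <- data.frame(value  = runif(50, 0, 1),
                 value2 = runif(50, 0, 1),
                 type1  = c(rep("AAAAAAAAAAAAAAAAAAAAAA", 25),
                            rep("BBBBBBBBBBBBBBBBB", 25)),
                 type2  = rep(c("c", "d"), 25),
                 number = rep(2:6, 10))

# Stack the two value columns and repeat the grouping columns to match
DF_long <- data.frame(
  value      = c(DF$value, DF$value2),
  value_type = rep(c("value1", "value2"), each = nrow(DF)),
  type1      = rep(DF$type1, 2),
  type2      = rep(DF$type2, 2),
  number     = rep(DF$number, 2)
)
```

DF_long can then be passed to the ggplot call above in place of DF.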

Upvotes: 0
