meb

Reputation: 19

How should I organize my data frame to easily compare specific groups and graph them?

I am new to R and don't yet know how I should format my data to produce multiple graphs that compare certain groups to each other. I have two timepoints (T1 and T2) for 3 treatments and 2 controls, and I want to create multiple graphs comparing specific groups to each other.

test <- structure(list(group = c("control1 T1", "control2 T1", "treatment1 T1", 
"treatment2 T1", "treatment3 T1", "control1 T1", "control2 T1", 
"treatment1 T1", "treatment2 T1", "treatment3 T1", "control1 T1", 
"control2 T1", "treatment1 T1", "treatment2 T1", "treatment3 T1", 
"control1 T2", "control2 T2", "treatment1 T2", "treatment2 T2", 
"treatment3 T2", "control1 T1", "control2 T1", "treatment1 T1", 
"treatment2 T1", "treatment3 T1", "control1 T1", "control2 T1", 
"treatment1 T1", "treatment2 T1", "treatment3 T1", "control1 T1", 
"control2 T1", "treatment1 T1", "treatment2 T1", "treatment3 T1", 
"control1 T2", "control2 T2", "treatment1 T2", "treatment2 T2", 
"treatment3 T2"), value = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 
5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 
1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 4L, 5L, 6L)), class = "data.frame", row.names = c(NA, 
-40L))

I have tried this:

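library(ggpubr)  # provides ggboxplot(), rotate_x_text(), stat_compare_means(); also attaches ggplot2

# pairs of groups to compare (T1 vs T2 within the same group)
my_comparisons <- list(c("control1 T1", "control1 T2"),
                       c("control2 T1", "control2 T2"),
                       c("treatment1 T1", "treatment1 T2"),
                       c("treatment3 T1", "treatment3 T2"))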

ggboxplot(test, x = "group", y = "value", color = "group", 
          #add = "jitter",
          legend = "none", outlier.shape = NA) + 
  rotate_x_text(angle = 45) + geom_jitter(width = 0.15, alpha = .1, color = "black") +
  stat_compare_means(comparisons = my_comparisons, label.y = c(5, 5, 5, 5, 5, 5))+
  stat_compare_means(label.y = 5)

The graph produced by the ggboxplot call above is nice, but I want to compare specific groups to each other, for example "treatment1 T1" and "treatment1 T2".

I tried facet_wrap.

p <- ggplot(data = test, aes(x=group, y=value)) + 
  geom_boxplot(aes(fill=group))
p + facet_wrap( ~ group, scales="free")

I like this format, but I only get one boxplot per facet. Ideally I want to compare two groups within each facet, and I don't know how to do that. I could manually split the data apart and make each graph one at a time, but it should be possible to do it all at once and choose which groups to compare in each facet. How can I do that?

Upvotes: 0

Views: 69

Answers (1)

tamtam

Reputation: 3671

If you want to facet the plots by specific groups, you need a new column that defines the grouping. Below I create the column group2, in which the same group at different timepoints (T1, T2) gets the same number. (You can replace the numbers with character labels if you like.)

Note: I resampled your value column because, in the original test dataset, the groups had no variance, so each boxplot was drawn as a single line.

library(tidyverse)

# put some variance in value (seed fixed so the random values are reproducible)
set.seed(1)
test <- test %>%
  mutate(value = sample(1:5, 40, replace = TRUE))

# create new column - group2
test <- test %>% 
  mutate(group2 = case_when(group %in% c("control1 T1", "control1 T2") ~ 1,
                            group %in% c("treatment1 T1", "treatment1 T2") ~ 3,
                            group %in% c("control2 T1", "control2 T2") ~ 2,
                            group %in% c("treatment2 T1", "treatment2 T2") ~ 4,
                            group %in% c("treatment3 T1", "treatment3 T2") ~ 5, 
                            TRUE ~ NA_real_))


# facet by group2
p <- ggplot(data = test, aes(x=group, y=value)) + 
  geom_boxplot(aes(fill=group)) 
p + facet_wrap( ~ as.factor(group2), scales="free")

That's the result.

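As an alternative to writing out each case_when() branch by hand, the group label can be split into two columns, one for the condition and one for the timepoint. The sketch below is not part of the original answer: it assumes every label has the form "<condition> <timepoint>" separated by a single space, and it uses a small made-up data frame called demo (a hypothetical stand-in for test) plus the illustrative column names condition and time.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Made-up stand-in for the question's `test` data frame (values are random).
set.seed(1)
demo <- data.frame(
  group = rep(c("control1 T1", "control1 T2",
                "treatment1 T1", "treatment1 T2"), each = 5),
  value = sample(1:5, 20, replace = TRUE)
)

# Split "control1 T1" into condition = "control1" and time = "T1".
# remove = FALSE keeps the original `group` column as well.
demo_long <- demo %>%
  separate(group, into = c("condition", "time"), sep = " ", remove = FALSE)

# One facet per condition, with T1 and T2 side by side inside each facet.
p <- ggplot(demo_long, aes(x = time, y = value, fill = time)) +
  geom_boxplot() +
  facet_wrap(~ condition)
p
```

With the data in this shape, choosing what to compare becomes a matter of filtering rows (for example, filter(condition %in% c("control1", "treatment1"))) before plotting, rather than rebuilding the grouping column.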

EDIT: More flexible functions

If you are familiar with regex, you can derive the facet variable inside facet_wrap() itself. I used the stringr package (loaded as part of the tidyverse) for the following examples.

# facet by T1 - T2
p <- ggplot(data = test, aes(x=group, y=value)) + 
  geom_boxplot(aes(fill=group)) 
p + facet_wrap( ~ str_extract(group, "T[123]{1}"), scales="free")

# facet by control vs treatment
p <- ggplot(data = test, aes(x=group, y=value)) + 
  geom_boxplot(aes(fill=group)) 
p + facet_wrap( ~ str_extract(group, "treatment|control"), scales="free")

# facet by group 
p <- ggplot(data = test, aes(x=group, y=value)) + 
  geom_boxplot(aes(fill=group)) 
p + facet_wrap( ~ str_extract(group, "treatment[123]|control[12]"), scales="free")
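Before using a pattern in facet_wrap(), it can help to check what str_extract() returns for a few labels, since each distinct result becomes one facet. The short sketch below runs patterns equivalent to the ones above on a hypothetical vector called labels, which is only there for illustration.

```r
library(stringr)

# A few labels in the same format as the `group` column.
labels <- c("control1 T1", "control2 T2", "treatment3 T1")

# Timepoint only: one facet per timepoint.
by_time <- str_extract(labels, "T[12]")

# Control vs treatment, ignoring the number.
by_type <- str_extract(labels, "treatment|control")

# Full condition name, ignoring the timepoint.
by_condition <- str_extract(labels, "treatment[123]|control[12]")

print(by_time)       # "T1" "T2" "T1"
print(by_type)       # "control" "control" "treatment"
print(by_condition)  # "control1" "control2" "treatment3"
```

Labels that do not match the pattern come back as NA, so an unexpected NA facet in the plot usually means the regex missed some of the group names.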

Upvotes: 1
