Reputation: 45
I am struggling quite a bit with making the labels of a plot look a certain way. I am using ggplot2 and tidyverse.
This is what I have:
I would like to have two headlines (=name) for the legend, one for the cell type HCT, and one for the cell type RKO. Then for HCT and RKO each, I want to have the legend for the Reagent with the respective color, linetype and shape. So basically, I want to break up the color legend into two separate legends. I just can't wrap my head around how to code it. Here is a drawing of what I would like to have instead (for the figure legend; please imagine the orange square is filled in):
Do I need to change my geom_line and geom_point code in order to achieve the legend style I'd like? Or is there another way to do it? I tried searching for a way to do it but couldn't find anything (maybe I am just not using the correct terms). I already tried following what was done in "How to merge color, line style and shape legends in ggplot" and "Combine legends for color and shape into a single legend", but I couldn't get it to work. (In other words, I tried changing scale_shape_manual etc. to get what I want, with no success. I also attempted to use interaction().)
Note: I decided not to use facet_wrap since I want to show both of the cell types on the same plot. The plot of the real data looks a little different and it's not as overwhelming. I was able to successfully plot a "facet_wrap" plot with ggpubr.
Note2: I also did not use stat_summary() because I need to take the mean within each combination of reagent concentration, reagent and cell type. With my data, I did not find a way to make stat_summary work.
Here is the code that I currently have:
mean_mutated <- mutated %>% group_by(Reagent, Reagent.Conc, Cell.type) %>%
  summarise(Avg.Viable.Cells = mean(Mean.Viable.Cells.1, na.rm = TRUE))
mutated_0 = mutated %>% group_by(Reagent, Reagent.Conc, Cell.type) %>% filter(Reagent=="0") %>%
  summarise(Avg.Viable.Cells = mean(Mean.Viable.Cells.1, na.rm = TRUE))
mutated_1 = mutated %>% group_by(Reagent, Reagent.Conc, Cell.type) %>% filter(Reagent=="1") %>%
  summarise(Avg.Viable.Cells = mean(Mean.Viable.Cells.1, na.rm = TRUE))
mutated_2 = mutated %>% group_by(Reagent, Reagent.Conc, Cell.type) %>% filter(Reagent=="2") %>%
  summarise(Avg.Viable.Cells = mean(Mean.Viable.Cells.1, na.rm = TRUE))
#linetype by reagent
ggplot() +
  #the scatter plot per cell type -> that way I can color them the way I want to, I believe
  #the mean/average line plot
  geom_point(mean_mutated, mapping= aes(x = as.factor(Reagent.Conc), y = Avg.Viable.Cells, shape=as.factor(Reagent), color=Cell.type)) +
  geom_line(mutated_1, mapping= aes(x = as.factor(Reagent.Conc),y = Avg.Viable.Cells, group=Cell.type, color=Cell.type, linetype = "1"))+
  geom_line(mutated_2, mapping= aes(x = as.factor(Reagent.Conc),y = Avg.Viable.Cells, group=Cell.type, color=Cell.type, linetype = "2"))+
  geom_line(mutated_0, mapping= aes(x = as.factor(Reagent.Conc),y = Avg.Viable.Cells, group=Cell.type, color=Cell.type, linetype = "0"))+
  #making the plot look prettier
  scale_colour_manual(values = c("#999999", "#E69F00")) +
  #scale_linetype_manual(values = c("solid", "dashed", "dotted")) + #for whatever reason, when I add this, the dash in the legend is removed...?
  labs(shape = "Reagent", linetype = "Reagent", color="Cell type")+
  scale_shape_manual(values=c(15,16,4), labels=c("0", "1", "2"))+
  #guides(shape = FALSE)+ #this removes the label that you don't want
  #Change the look of the plot and change the axes
  xlab("[Reagent] (nM/ml)")+ #change name of x-axis
  ylab("Relative viability")+ #change name of y-axis
  scale_y_continuous(breaks = scales::pretty_breaks(n = 10))+ #adjust the y-axis so that it has more ticks
  expand_limits(y = 0)+
  theme_bw() + #this and the next line are to remove the background grid and make it look more publication-like
  theme(panel.border = element_blank(), panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(), axis.line = element_line(colour = "black"))
And a snapshot of my data frame "mutated" produced by dput(df[9:32, c(1,2,3,4,5)]):
structure(list(Biological.Replicate = c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), Reagent.Conc = c(10000, 2500, 625, 156.3, 39.1, 9.8,
2.4, 0.6, 10000, 2500, 625, 156.3, 39.1, 9.8, 2.4, 0.6, 10000,
2500, 625, 156.3, 39.1, 9.8, 2.4, 0.6), Reagent = c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L), Cell.type = c("HCT", "HCT", "HCT", "HCT",
"HCT", "HCT", "HCT", "HCT", "HCT", "HCT", "HCT", "HCT", "HCT",
"HCT", "HCT", "HCT", "RKO", "RKO", "RKO", "RKO", "RKO", "RKO",
"RKO", "RKO"), Mean.Viable.Cells.1 = c(1.014923966, 1.022279854,
1.00926559, 0.936979842, 0.935565248, 0.966403395, 1.00007073,
0.978144524, 1.019673384, 0.991595836, 0.977270557, 1.007353643,
1.111928183, 0.963518289, 0.993028364, 1.027409034, 1.055452733,
0.953801253, 0.956577449, 0.792568337, 0.797052961, 0.755623576,
0.838482346, 0.836773918)), row.names = 9:32, class = "data.frame")
Note3: Even though one column name is "Mean.Viable.Cells.1", this is not the mean I am plotting, but rather the mean of a technical replicate, calculated previously. I am taking the mean of the biological replicates in mutated_0, mutated_1 and mutated_2 to plot it.
Upvotes: 3
Views: 1015
Reputation: 124083
Making use of the ggnewscale package, this could be achieved like so:

1. Convert Cell.type and Reagent to factors before manipulating the dataset.
2. There is no need for the separate datasets mutated_0, ... You only need one summary dataset, which I split by Cell.type to simplify the code later on.
3. Use ggnewscale::new_scale to add a second scale and legend for linetype and shape. Moreover, remove color from the aesthetics and set it as an argument.
4. Add drop=FALSE to both scales to keep unused factor levels.
5. Use a helper function to add the layers and scales for each Cell.type.

library(ggplot2)
library(dplyr)
mutated <- mutated %>%
  mutate(
    Cell.type = factor(Cell.type, levels = c("HCT", "RKO")),
    Reagent = factor(Reagent, levels = c("0", "1", "2"))
  )
mean_mutated <- mutated %>%
  group_by(Reagent, Reagent.Conc, Cell.type) %>%
  summarise(Avg.Viable.Cells = mean(Mean.Viable.Cells.1, na.rm = TRUE)) %>%
  split(.$Cell.type)
#> `summarise()` has grouped output by 'Reagent', 'Reagent.Conc'. You can override using the `.groups` argument.
# Helper: returns the point and line layers for one cell type, drawn in a fixed
# color, together with the linetype and shape scales whose legend title is the cell type
layer_geom_scale <- function(cell_type, color) {
  list(
    geom_point(mean_mutated[[cell_type]], mapping = aes(shape = Reagent), color = color),
    geom_line(mean_mutated[[cell_type]], mapping = aes(group = Reagent, linetype = Reagent), color = color),
    scale_linetype_manual(name = cell_type, values = c("solid", "dashed", "dotted"), drop = FALSE),
    scale_shape_manual(name = cell_type, values = c(15, 16, 4), labels = c("0", "1", "2"), drop = FALSE)
  )
}
# linetype by reagent
ggplot(mapping = aes(
  x = as.factor(Reagent.Conc),
  y = Avg.Viable.Cells
)) +
  layer_geom_scale("HCT", "#999999") +
  ggnewscale::new_scale("linetype") +
  ggnewscale::new_scale("shape") +
  layer_geom_scale("RKO", "#E69F00") +
  scale_y_continuous(breaks = scales::pretty_breaks(n = 10), limits = c(0, NA)) +
  theme_bw() +
  theme(
    panel.border = element_blank(), panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(), axis.line = element_line(colour = "black")
  ) +
  labs(shape = "Reagent",
       linetype = "Reagent",
       color = "Cell type",
       x = "[Reagent] (nM/ml)",
       y = "Relative viability")
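
As a rough illustration of why drop=FALSE matters here, the base R sketch below uses a made-up toy data frame (toy and by_cell are hypothetical names, and the rows are not the real data). It mirrors the posted sample, where RKO only appears with reagent 0. After splitting by Cell.type, each piece still declares all three Reagent levels but may use only some of them. With the default drop=TRUE the scale would omit the unused levels from that cell type's legend, and with manual scales it could also hand out linetypes and shapes differently in each legend.

```r
# Toy data mirroring the structure of the posted sample (values are invented):
# HCT was measured with reagents 1 and 2, RKO only with reagent 0.
toy <- data.frame(
  Cell.type = factor(c("HCT", "HCT", "RKO"), levels = c("HCT", "RKO")),
  Reagent   = factor(c(1L, 2L, 0L), levels = c("0", "1", "2"))
)

# Splitting by cell type keeps the full set of declared factor levels ...
by_cell <- split(toy, toy$Cell.type)
declared_rko <- levels(by_cell$RKO$Reagent)

# ... even though only one level actually occurs in the RKO subset.
used_rko <- unique(as.character(by_cell$RKO$Reagent))

# Dropping unused levels leaves only the level that is present, which is
# roughly what a scale with drop = TRUE would show in the legend.
dropped_rko <- levels(droplevels(by_cell$RKO$Reagent))

print(declared_rko)
print(used_rko)
print(dropped_rko)
```

If the real data contains every reagent for both cell types, drop=FALSE changes nothing, so keeping it is harmless.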
Upvotes: 2