Alphonse Charbel

Reputation: 37

How to generate a separate colour scale for every variable in ggplot

I am trying to generate a ggplot with the following aesthetic (see below) on a relatively small data set (x = case number; y1, y2, y3, etc. = multiple variables describing different characteristics of the cases).

library(tidyverse)  # provides tibble(), pivot_longer(), %>% and ggplot()

Case <- c("Case 1", "Case 2", "Case 3", "Case 4", "Case 5")
Age <- c(53, 46, 72, 68, 45)
Tumor_Stage <- c(1, 2, 3, 1, 2)
Tumor_Grade <- c(3, 1, 2, 2, 1)
Smoking_Status <- c(0, 1, 1, 0, 1)
CD3 <- c(0, 1, 0, 0, 1)
df <- tibble(Case, Age, Tumor_Stage, Tumor_Grade, Smoking_Status, CD3)

# Reshape to long format: one row per case and variable
df1 <- df %>%
  pivot_longer(
    cols = c(Age, Tumor_Stage, Tumor_Grade, Smoking_Status, CD3),
    names_to = "Variables"
  )

ggplot(df1, aes(x = Case, y = Variables, col = value, fill = value)) +
  geom_tile()

I am getting the following plot:

My Attempt

The problem is that a single scale is generated for all variables, whereas I want a different color scale for every y variable. What is the best way to do this? Thank you so much for your help!

My Objective, something similar to this

I am just stuck here and don't know how to continue... I've seen examples using

scale_color_gradient() +
  new_scale_color()

but I don't know how to integrate these functions for every single y variable.

Upvotes: 1

Views: 394

Answers (1)

stefan

Reputation: 124258

One option to get an individual fill scale and legend for each variable via ggnewscale is to use multiple geom_tile layers, one per variable. To this end, split your data frame by Variables, then use e.g. purrr::imap to add one layer, one fill scale and one new_scale_fill() per variable:

library(ggplot2)
library(ggnewscale)

df1_split <- split(df1, df1$Variables)

legend_order <- rev(seq_along(df1_split))
names(legend_order) <- names(df1_split)

ggplot(df1, aes(
  x = Case,
  y = Variables
)) +
  purrr::imap(df1_split, function(x, y) {
    order <- legend_order[[y]]
    list(
      geom_tile(aes(fill = value), data = x),
      scale_fill_gradient(
        name = y,
        guide = guide_colorbar(direction = "horizontal", title.position = "top", order = order)
      ),
      new_scale_fill()
    )
  })
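
To make the roles of the two function arguments concrete, here is a minimal sketch on a toy data frame. The `toy` data and the `layer_info` name are made up for illustration and are not part of the original code. It only shows that imap() hands each piece of the split data to the function as `x` and the piece's name (here the variable) as `y`, which is why `y` can be used as the legend title above.

```r
library(purrr)

# Toy long-format data standing in for df1 (hypothetical values)
toy <- data.frame(
  Case = rep(c("Case 1", "Case 2"), times = 2),
  Variables = rep(c("Age", "CD3"), each = 2),
  value = c(53, 46, 0, 1)
)

# split() returns a named list with one data frame per variable
toy_split <- split(toy, toy$Variables)

# imap() calls the function with the element (x) and its name (y)
layer_info <- imap(toy_split, function(x, y) {
  paste0(y, ": ", nrow(x), " rows")
})
```

Note that `x` and `y` inside the function are unrelated to the x and y aesthetics of the plot; they are simply the argument names chosen for the data subset and its name.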

A second option is to create a separate plot for each variable and glue them together using e.g. patchwork. This approach requires some more effort to make the patchwork look like one plot, i.e. setting the plot margins and removing the x axis from all but the bottom panel. One advantage is that the legends are nicely aligned with the y axis categories. I would guess that this approach was used for the example plot you added as an image.


library(patchwork)

plot_fun <- function(x, y) {
  theme_adjust <- if (y != "Tumor_Stage") {
    theme(
      axis.line.x = element_blank(),
      axis.text.x = element_blank(),
      axis.title.x = element_blank(),
      axis.ticks.x = element_blank(),
      axis.ticks.length.x = unit(0, "pt")
    )  
  }
  
  plot_margin <- if (y == "Tumor_Stage") {
    theme(plot.margin = margin(0, 5.5, 5.5, 5.5))
  } else if (y == "Age") {
    theme(plot.margin = margin(5.5, 5.5, 0, 5.5))
  } else {
    theme(plot.margin = margin(0, 5.5, 0, 5.5))
  }
  
  ggplot(df1, aes(
    x = Case,
    y = Variables
  )) +
    geom_tile(aes(fill = value), data = x) +
    scale_fill_gradient(
          name = y,
          guide = guide_colorbar(direction = "horizontal", title.position = "top")
        ) +
    scale_y_discrete(expand = c(0, 0)) +
    theme_adjust +
    plot_margin +
    labs(y = NULL)
}

purrr::imap(df1_split, plot_fun) |> 
  wrap_plots(ncol = 1)

EDIT: Concerning your second question: if you have a mix of categorical and numerical variables, I would suggest using the data in wide format. For the example below I slightly altered the example data and converted Smoking_Status and Tumor_Stage to factors. As for the fill scales, there are various approaches in general. An easy, but probably not the most elegant, one is to create a named list containing the desired fill scale for each variable. I again opted for the patchwork approach. Note that, as I now use the wide dataset, it is no longer necessary to split the dataset; instead we loop over the column names.

Case <- c("Case 1", "Case 2", "Case 3", "Case 4", "Case 5")
Age <- c(53, 46, 72, 68, 45)
Tumor_Stage <- c(1, 2, 3, 1, 2)
Tumor_Grade <- c(3, 1, 2, 2, 1)
Smoking_Status <- c(0, 1, 1, 0, 1)
CD3 <- c(0, 1, 0, 0, 1)
df <- data.frame(Case, Age, Tumor_Stage, Tumor_Grade, Smoking_Status, CD3)

df$Smoking_Status <- factor(df$Smoking_Status)
df$Tumor_Stage <- factor(df$Tumor_Stage)

library(ggplot2)
library(patchwork)

cols <- c("Age", "Tumor_Stage", "Tumor_Grade", "Smoking_Status", "CD3")
cols <- sort(cols)

scale_fill <- lapply(cols, function(x) {
  if (x == "Smoking_Status") {
    scale_fill_brewer(type = "div", name = x, palette = "BrBG",
                      guide = guide_legend(direction = "horizontal", title.position = "top"))
  } else if (x == "Tumor_Stage") {
    scale_fill_brewer(type = "div", name = x, palette = "PiYG",
                      guide = guide_legend(direction = "horizontal", title.position = "top"))
  } else if (x == "Age") {
    scale_fill_gradient(name = x, low = "lightgreen", high = "darkgreen",
                        guide = guide_colorbar(direction = "horizontal", title.position = "top")
    )
  } else {
    scale_fill_gradient(name = x, 
                        guide = guide_colorbar(direction = "horizontal", title.position = "top")
    ) 
  }
})
names(scale_fill) <- cols

plot_fun <- function(x) {
  theme_adjust <- if (x != "Tumor_Stage") {
    theme(
      axis.line.x = element_blank(),
      axis.text.x = element_blank(),
      axis.title.x = element_blank(),
      axis.ticks.x = element_blank(),
      axis.ticks.length.x = unit(0, "pt")
    )  
  }
  
  plot_margin <- if (x == "Tumor_Stage") {
    theme(plot.margin = margin(0, 5.5, 5.5, 5.5))
  } else if (x == "Age") {
    theme(plot.margin = margin(5.5, 5.5, 0, 5.5))
  } else {
    theme(plot.margin = margin(0, 5.5, 0, 5.5))
  }
  
  scale_fill <- scale_fill[[x]]
  ggplot(df, aes(
    x = Case,
    y = x
  )) +
    geom_tile(aes(fill = .data[[x]])) +
    scale_fill +
    scale_y_discrete(expand = c(0, 0)) +
    theme_adjust +
    plot_margin +
    labs(y = NULL)
}

purrr::map(cols, plot_fun) |> 
  wrap_plots(ncol = 1)
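
As a possible variation on the list of fill scales, the if/else chain could be replaced by a named list written out directly. This is only a sketch under the same assumptions as above (same column names, same palettes); the names `bar`, `key`, `fill_scales` and `age_scale` are made up for illustration, and the guides here set only the direction to keep the example short.

```r
library(ggplot2)

# Shared guides, so the legend layout is defined in one place
bar <- guide_colorbar(direction = "horizontal")
key <- guide_legend(direction = "horizontal")

# One entry per column; the palettes are illustrative choices
fill_scales <- list(
  Age = scale_fill_gradient(
    name = "Age", low = "lightgreen", high = "darkgreen", guide = bar
  ),
  CD3 = scale_fill_gradient(name = "CD3", guide = bar),
  Smoking_Status = scale_fill_brewer(
    name = "Smoking_Status", palette = "BrBG", guide = key
  ),
  Tumor_Grade = scale_fill_gradient(name = "Tumor_Grade", guide = bar),
  Tumor_Stage = scale_fill_brewer(
    name = "Tumor_Stage", palette = "PiYG", guide = key
  )
)

# Look up a scale by column name, as plot_fun() does with scale_fill[[x]]
age_scale <- fill_scales[["Age"]]
```

The trade-off is more repetition per entry in exchange for seeing every variable's scale at a glance; the lapply() version is shorter when most variables share a default.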

Upvotes: 2
