p-robot

Reputation: 4904

Use custom color AND custom label from dataframe columns in ggplot2

I would like ggplot to take the color and label aesthetics from the dataframe that's being plotted. I understand you can use scale_colour_identity() to take color from the data.frame (the col column below) but I'm not sure how to get it to read custom labels too (the lab column below).

Note: I appreciate you can do these things by hard-coding the labels and colours in the script (the bunch of manual commands in ggplot). However, I want to avoid hard-coding them in the R script (which seems like bad coding practice to me) because it makes it harder to reuse the same plotting metadata in a Python script (or another R script), for instance. I appreciate you could also put the colours in an R script and source() it, but the same issue then arises when trying to use consistent colours in Python plots.

require(ggplot2)

set.seed(42)

# Define test dataset
n_trt <- 3
n_samples <- 5
cols <- c("#F2762E", "#F2AB27", "#4F8C11")

df <- data.frame(
    trt = rep(LETTERS[1:n_trt], each = n_samples),
    response_x = rep(rnorm(n_trt), each = n_samples) + runif(n_trt*n_samples),
    response_y = rep(rnorm(n_trt), each = n_samples) + runif(n_trt*n_samples),
    col = rep(cols, each = n_samples),
    lab = rep(paste0(LETTERS[1:n_trt], letters[1:n_trt]), each = n_samples))

#> print(head(df))
#  trt response_x response_y     col lab
#1   A  2.1075468 -0.1803944 #F2762E  Aa
#2   A  1.5056250 -0.6391629 #F2762E  Aa
#3   A  2.0279507 -0.2501283 #F2762E  Aa
#4   A  2.0760232 -0.3485370 #F2762E  Aa
#5   A  1.8287002 -0.2750774 #F2762E  Aa
#6   B  0.1544141  2.0014811 #F2AB27  Bb

# Plotting routine
p <- ggplot(df, aes(x = response_x, y = response_y, 
        color = col, labels = lab, group = lab)) + 
    geom_point(size = 2.5) + 
    scale_colour_identity(guide = "legend", aes(labels = lab)) + 
    theme_bw() + 
    theme(legend.position = "top")

ggsave("test.png", p, width = 6, height = 4)


Upvotes: 1

Views: 1697

Answers (3)

stefan

Reputation: 125373

Instead of relying on unique(), you could pass your desired colors and labels as named vectors to scale_colour_manual(), which ensures that they are assigned to the right categories. As your colors and labels are stored in the data frame, you could use dplyr::distinct() and tibble::deframe() to build the named vectors like so:

require(ggplot2)
library(dplyr)

cols <- dplyr::distinct(df, trt, col) %>% tibble::deframe()
cols
#>         A         B         C 
#> "#F2762E" "#F2AB27" "#4F8C11"
labs <- dplyr::distinct(df, trt, lab) %>% tibble::deframe()
labs
#>    A    B    C 
#> "Aa" "Bb" "Cc"

# Plotting routine
ggplot(df, aes(x = response_x, y = response_y, color = trt)) + 
  geom_point(size = 2.5) + 
  scale_colour_manual(values = cols, labels = labs) + 
  theme_bw() + 
  theme(legend.position = "top")
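
If you would rather not depend on dplyr and tibble, the same named vectors can probably be built in base R. This is a minimal sketch, assuming the trt, col and lab columns from the question; the intermediate name meta is only illustrative:

```r
# Rebuild just the metadata columns of the question's data frame.
df <- data.frame(
  trt = rep(LETTERS[1:3], each = 5),
  col = rep(c("#F2762E", "#F2AB27", "#4F8C11"), each = 5),
  lab = rep(paste0(LETTERS[1:3], letters[1:3]), each = 5),
  stringsAsFactors = FALSE
)

# One row per treatment; unique() on a data frame drops duplicated rows.
meta <- unique(df[, c("trt", "col", "lab")])

# Named vectors: names are the treatment levels, values the colour or label.
cols <- setNames(meta$col, meta$trt)
labs <- setNames(meta$lab, meta$trt)
```

The resulting cols and labs can then be passed to scale_colour_manual(values = cols, labels = labs) exactly as in the plotting routine above.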

Upvotes: 2

p-robot

Reputation: 4904

One can use unique() with the scale_*_manual() family of functions, but this relies on unique() returning the colours and labels in a consistent, matching order, so it is not ideal.

df$col <- as.character(df$col)
df$lab <- as.character(df$lab)

# Plotting routine
p <- ggplot(df, aes(x = response_x, y = response_y, group = lab)) + 
    geom_point(aes(color = lab), size = 2.5) + 
    theme_bw() + 
    scale_color_manual(name = "", values = unique(df$col), labels = unique(df$lab)) + 
    theme(legend.position = "top")

ggsave("test.png", p, width = 6, height = 4)
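
To illustrate the ordering concern: unique() returns values in order of first appearance, whereas a discrete scale on a character column orders its categories by sorting them, so the two can disagree when the rows are not already sorted. The following is a small base R sketch with a deliberately shuffled toy data frame (the data and the names lv and values are made up for illustration), looking colours up by name instead of by position:

```r
# Toy data whose rows are not sorted by label.
df <- data.frame(
  lab = c("Cc", "Aa", "Bb", "Aa", "Cc"),
  col = c("#4F8C11", "#F2762E", "#F2AB27", "#F2762E", "#4F8C11"),
  stringsAsFactors = FALSE
)

# unique() keeps first-appearance order: Cc, Aa, Bb.
first_seen <- unique(df$lab)

# Sorted category order: Aa, Bb, Cc.
lv <- sort(unique(df$lab))

# Look up each category's colour via match(), then name the vector
# so the mapping no longer depends on position.
values <- setNames(df$col[match(lv, df$lab)], lv)
```

Passing a named vector such as values to scale_color_manual(values = values) should keep each colour attached to its own category regardless of row order.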


Upvotes: 2

Ray

Reputation: 2288


Control your labels with a separate layer, such as geom_text().
To ensure the labels do not overlap with your points, you can offset them with nudge_x or nudge_y.

If the labels still overlap each other, look into geom_text_repel() from the ggrepel package.

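# Colours come straight from the col column; the lab text is drawn next to each point.
# ("labels" is not a ggplot2 aesthetic, and the scale does not take an aes() call, so both are dropped.)
p <- ggplot(df, aes(x = response_x, y = response_y, 
                    color = col, group = lab)) + 
    geom_point(size = 2.5) + 
    # nudge_x shifts each label to the right of its point (in data units)
    geom_text(aes(label = lab), nudge_x = 0.2) +
    scale_colour_identity(guide = "legend") + 
    theme_bw() + 
    theme(legend.position = "top")
p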
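
If the goal is for the legend itself to show the lab values while the colours are still read from col, one option that may work is to give scale_colour_identity() explicit breaks and labels taken from the data frame. This is a hedged sketch rather than part of the answer above: the response values are made-up placeholders for the question's random data, and meta is an illustrative name.

```r
library(ggplot2)

# Stand-in for the question's data frame, with deterministic values.
df <- data.frame(
  trt = rep(LETTERS[1:3], each = 5),
  response_x = rep(c(0, 1, 2), each = 5) + seq(0, 0.4, by = 0.1),
  response_y = rep(c(2, 0, 1), each = 5) + seq(0, 0.4, by = 0.1),
  col = rep(c("#F2762E", "#F2AB27", "#4F8C11"), each = 5),
  lab = rep(paste0(LETTERS[1:3], letters[1:3]), each = 5),
  stringsAsFactors = FALSE
)

# One row per colour/label pair, so breaks and labels stay aligned.
meta <- unique(df[, c("col", "lab")])

p <- ggplot(df, aes(x = response_x, y = response_y, colour = col)) +
  geom_point(size = 2.5) +
  # The identity scale uses the hex codes as-is; breaks/labels relabel the legend keys.
  scale_colour_identity(
    name = NULL, guide = "legend",
    breaks = meta$col, labels = meta$lab
  ) +
  theme_bw() +
  theme(legend.position = "top")
```

Because breaks and labels come from the same rows of meta, each legend key is paired with the label stored alongside its colour.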

Upvotes: 0
