sh_student

Reputation: 389

ggplot - Set colour of lines depending on a variable with a changing presence of the variable type within the data

Assume the following data frame:

mydf <- data.frame(date = as.Date(rep(c('2019-11-01', '2019-10-01'), 2)), 
                    value = c(10, 15, 8, 4),
                    type = c('Type 1', 'Type 1', 'Type 2', 'Type 2'))

print(mydf)
        date value   type
1 2019-11-01    10 Type 1
2 2019-10-01    15 Type 1
3 2019-11-01     8 Type 2
4 2019-10-01     4 Type 2

I want to write a script that automatically creates a line plot with one line per type and sets the colour of each line. In general, I know how to do that:

library(ggplot2)
myplot <- ggplot(mydf, aes(x = date, y = value, colour = type)) + geom_line() +
  scale_color_manual(name = 'Type', values=c('blue', 'red'))

However, the data frame might change when the code is run in another month. For example, there might be a Type 3 in the data frame:

mydf <- data.frame(date = as.Date(rep(c('2019-11-01', '2019-10-01'), 3)), 
                    value = c(10, 15, 8, 4, 12, 8),
                    type = c('Type 1', 'Type 1', 'Type 2', 'Type 2', 'Type 3', 'Type 3'))

print(mydf)
        date value   type
1 2019-11-01    10 Type 1
2 2019-10-01    15 Type 1
3 2019-11-01     8 Type 2
4 2019-10-01     4 Type 2
5 2019-11-01    12 Type 3
6 2019-10-01     8 Type 3

And in yet another month Type 1 or Type 2 might not be in the data:

mydf <- data.frame(date = as.Date(rep(c('2019-11-01', '2019-10-01'), 2)), 
                    value = c(10, 15, 8, 4),
                    type = c('Type 1', 'Type 1', 'Type 3', 'Type 3'))

print(mydf)
        date value   type
1 2019-11-01    10 Type 1
2 2019-10-01    15 Type 1
3 2019-11-01     8 Type 3
4 2019-10-01     4 Type 3

How can I pre-define the colours for Type 1, Type 2 and Type 3 so that each plot uses the defined colour for whichever types are present in the data? I want to run the script on new data without changing anything in the code. Assume Type 1 should be blue, Type 2 red and Type 3 black in the plots for all three data frames. Thanks!

Upvotes: 0

Views: 1405

Answers (1)

Ronak Shah

Reputation: 388862

The values argument of scale_color_manual() accepts a named vector, which maps each colour to its Type by name rather than by position.

library(ggplot2)

cols <- c('Type 1' = 'blue', 'Type 2' = 'red', 'Type 3' = 'black')

ggplot(mydf, aes(x = date, y = value, colour = type)) + geom_line() +
  scale_color_manual(name = 'Type', values = cols)

When all types are present in the data, the plot looks like this:

mydf <- data.frame(date = as.Date(rep(c('2019-11-01', '2019-10-01'), 3)), 
             value = c(10, 15, 8, 4, 12, 8),
             type = c('Type 1', 'Type 1', 'Type 2', 'Type 2', 'Type 3', 'Type 3'))

[Image: resulting plot with all three types]

When some types are absent, the same code still assigns the same colours to the types that remain:

mydf <- data.frame(date = as.Date(rep(c('2019-11-01', '2019-10-01'), 2)), 
                value = c(10, 15, 8, 4),
                type = c('Type 1', 'Type 1', 'Type 3', 'Type 3'))

[Image: resulting plot with Type 1 and Type 3 only]
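
Since the script is meant to run unattended each month, it may also be worth guarding against a type that has no entry in cols (say, a future Type 4). The following is only a minimal sketch in base R; check_types is a hypothetical helper name, not part of ggplot2, and it assumes the column is called type as in the question. It stops with a message if an unknown type shows up, and otherwise returns the subset of colours that will actually be used.

```r
# Named palette from the answer above.
cols <- c('Type 1' = 'blue', 'Type 2' = 'red', 'Type 3' = 'black')

# Hypothetical helper (not part of ggplot2): stop early if the data contains
# a type that has no entry in the palette.
check_types <- function(df, palette) {
  present <- unique(as.character(df$type))
  unknown <- setdiff(present, names(palette))
  if (length(unknown) > 0) {
    stop("No colour defined for: ", paste(unknown, collapse = ", "))
  }
  # Return the colours for the types that are present, in palette order.
  palette[names(palette) %in% present]
}

# Example data with Type 2 absent, as in the question.
mydf <- data.frame(date = as.Date(rep(c('2019-11-01', '2019-10-01'), 2)),
                   value = c(10, 15, 8, 4),
                   type = c('Type 1', 'Type 1', 'Type 3', 'Type 3'))

used <- check_types(mydf, cols)
print(used)
```

Calling check_types(mydf, cols) before building the plot keeps the plotting code itself unchanged; the full cols vector can still be passed to scale_color_manual() as shown above.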

Upvotes: 5
