Reputation: 709
I am using ggplot2 to draw some lines. I would like to change the labels. My data has two variables, x1 and x2.
The question is, how can I assign the labels in the correct order to x1 and x2, so that a certain label is assigned to x1 and another one is assigned to x2, and not the other way around. For instance, I would like to assign "AAAA" as label to x1 and "BBBB" as label to x2, and NOT "BBBB" to x1 and "AAAA" to x2. The following example shows what I mean:
library(data.table); library(ggplot2); library(scales)
set.seed(1)
test <- data.table(x = rnorm(29*2),var=c(rep("x1",29),rep("x2",29)),
time=rep(seq(as.Date("1983/12/31"),as.Date("2011/12/31"), "year"),2))
ggplot(data=test, aes(x=time, y=x, colour=var)) +
geom_line() +
scale_color_manual(labels = c("AAAA","BBBB"),values=c("blue","red"))
I am pretty sure that in the above example "AAAA" is assigned to x1, because x1 comes first in the data. But I am not always sure which variable comes first. Is there any better way for a more direct assignment? Or how to keep control?
Thanks for any hints.
Upvotes: 1
Views: 210
Reputation: 922
Just to offer you an alternative to Dave's answer: you can also use named vectors for both labels and colors, using the values of var ("x1" and "x2") as the names of the vector elements.
The advantage of this approach is that you do not need to modify your data (which is always risky and error-prone), yet you get full control over how ggplot represents it, in a simple and highly readable way.
With this approach, your code would look as follows (notice that I'm just tweaking your code a little bit):
library(ggplot2)
library(scales)
library(data.table)
set.seed(1)
test <- data.table(x = rnorm(29*2),var=c(rep("x1",29),rep("x2",29)),
time=rep(seq(as.Date("1983/12/31"),as.Date("2011/12/31"), "year"),2))
#Declaring named vector of labels 'plabels'
plabels <- c('x1' = "AAAA",
'x2' = "BBBB")
#Declaring named vector of colors 'pcolors'
pcolors <- c('x1' = "green",
'x2' = "blue")
#Plotting
ggplot(data=test, aes(x=time, y=x, colour=var)) +
geom_line() +
scale_color_manual(labels = plabels, values=pcolors)
This results in a plot where x1 is drawn in green and labelled AAAA, and x2 is drawn in blue and labelled BBBB.
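To illustrate why the named vectors give control over the assignment, here is a minimal sketch in which the vectors are deliberately written in the opposite order. The small data frame df is made up for illustration only, and matching labels by name assumes a reasonably recent ggplot2 version. Inspecting the built plot should show that x1 still gets green and x2 still gets blue.

```r
library(ggplot2)

# Small made-up data set, only for illustration (not the data from the question)
df <- data.frame(
  time = rep(1:3, 2),
  x    = c(1, 2, 3, 3, 2, 1),
  var  = rep(c("x1", "x2"), each = 3)
)

# Named vectors deliberately written with x2 first
pcolors <- c(x2 = "blue", x1 = "green")
plabels <- c(x2 = "BBBB", x1 = "AAAA")

p <- ggplot(df, aes(x = time, y = x, colour = var)) +
  geom_line() +
  scale_color_manual(values = pcolors, labels = plabels)

# Inspect which colour each group actually received
# (group 1 corresponds to "x1", group 2 to "x2")
built <- ggplot_build(p)
cols <- unique(built$data[[1]][, c("group", "colour")])
cols
```

Because the values are matched by name rather than by position, the order in which the entries are written should not matter.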
Upvotes: 2
Reputation: 359
Without scale_color_manual, different colors are assigned automatically to each of the variables included.
I think what you should do is change the values of the variable that you want to give new labels.
Does this work for you?
test$var <- as.factor(test$var) # It's a categorical variable; levels are sorted: "x1", "x2"
levels(test$var) <- c("AAAA","BBBB") # Rename the levels x1 and x2 to AAAA and BBBB
ggplot(data=test, aes(x=time, y=x, colour=var)) +
geom_line()
From now on, all your plots that use var will have x1 as AAAA and x2 as BBBB.
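Regarding the worry about which variable "comes first": when a character vector is turned into a factor, the levels are sorted, not taken in order of appearance in the data. The sketch below (with a made-up vector var, not the data from the question) shows this, and also shows how spelling out both levels and labels in factor() makes the pairing explicit.

```r
# Made-up example vector in which "x2" appears first in the data
var <- c("x2", "x2", "x1", "x1")

# Levels are sorted, so "x1" comes before "x2" regardless of data order
default_levels <- levels(factor(var))
default_levels

# Spelling out levels and labels together makes the pairing explicit:
# "x1" becomes "AAAA" and "x2" becomes "BBBB"
relabelled <- factor(var, levels = c("x1", "x2"), labels = c("AAAA", "BBBB"))
as.character(relabelled)
```

With the explicit levels and labels, the result does not depend on the order in which the values occur in the column.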
On the other hand, if you want to apply these changes without depending on the order of the values in the column (and without having to check it by hand), I suggest keeping a lookup table in which each row holds the original value and the value you want to replace it with, like a dictionary. (In my example I create it in the code as transf_vals, but it could be an external table.)
Then use this instead of the code above:
transf_vals <- data.frame(original = c("x1", "x2"), new = c("AAAA","BBBB"), stringsAsFactors = FALSE) # This could be a .csv or Excel file
test$var <- sapply(as.character(test$var), FUN = function(x){
transf_vals$new[which(transf_vals$original == x)] # Look up the new value for x
}, USE.NAMES = FALSE)
ggplot(data=test, aes(x=time, y=x, colour=var)) +
geom_line()
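With sapply, I do the following: for each value in test$var, I find the row of transf_vals whose original value matches it, and I return the corresponding new value from transf_vals.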
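The same lookup can also be written without an explicit loop. The following is a minimal sketch using base R's match(), with a made-up vector var standing in for the column; it is an alternative formulation, not the code from the answer above.

```r
# Lookup table: original value -> new value
transf_vals <- data.frame(
  original = c("x1", "x2"),
  new      = c("AAAA", "BBBB"),
  stringsAsFactors = FALSE
)

# Made-up example column, only for illustration
var <- c("x2", "x1", "x1", "x2")

# match() returns, for each element of var, its row position in the lookup table;
# indexing the 'new' column with those positions performs the translation
new_var <- transf_vals$new[match(var, transf_vals$original)]
new_var
```

Values that do not occur in the lookup table would come back as NA, which makes missing entries easy to spot.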
Upvotes: 1