Reputation: 457
This is my first question on SO; I hope someone can help me answer it.
I'm reading data from a CSV file in R with data<-read.csv("/data.csv") and get something like:
Group x y size Color
Medium 1 2 2000 yellow
Small -1 2 1000 red
Large 2 -1 4000 green
Other -1 -1 2500 blue
Each group's color may vary: colors are assigned by a formula when the CSV file is generated, but the ones above are all the possible colors (the number of groups may also vary).
I've been trying to use ggplot() like so:
library(ggplot2)
data<-read.csv("data.csv")
xlim<-max(c(abs(min(data$x)),abs(max(data$x))))
ylim<-max(c(abs(min(data$y)),abs(max(data$y))))
data$Color<-as.character(data$Color)
print(data)
ggplot(data, aes(x = x, y = y, label = Group)) +
geom_point(aes(size = size, colour = Group), show.legend = TRUE) +
scale_color_manual(values=c(data$Color)) +
geom_text(size = 4) +
scale_size(range = c(5,15)) +
scale_x_continuous(name="x", limits=c(xlim*-1-1,xlim+1))+
scale_y_continuous(name="y", limits=c(ylim*-1-1,ylim+1))+
theme_bw()
Everything is correct except for the colors.
I noticed that the legend on the right orders the groups alphabetically (Large, Medium, Other, Small), but the colors stay in the CSV file order.
Here is a screenshot of the plot.
Can anyone tell me what's missing in my code to fix this? Other approaches to achieve the same result are welcome.
Upvotes: 24
Views: 36278
Reputation: 1057
I had never heard of R back when this question was answered by @scoa, and I don't know if my solution was available then, but you can do what the OP asks with slightly less work using scale_color_identity().
library(tidyverse)
data <- tribble(
~Group,~x,~y,~size,~Color,
"Medium",1,2,2000,"yellow",
"Small",-1, 2,1000,"red",
"Large",2,-1,4000,"green",
"Other",-1,-1,2500,"blue")
xlim<-max(c(abs(min(data$x)),abs(max(data$x))))
ylim<-max(c(abs(min(data$y)),abs(max(data$y))))
ggplot(data, aes(x = x, y = y, label = Group)) +
geom_point(aes(size = size, colour = Color), show.legend = TRUE) + # Set aes(colour = Color) (the column in the dataframe)
scale_color_identity() + # This tells ggplot to use the values explicit in the 'Color' column
geom_text(size = 4) +
scale_size(range = c(5,15)) +
scale_x_continuous(name="x", limits=c(xlim*-1-1,xlim+1))+
scale_y_continuous(name="y", limits=c(ylim*-1-1,ylim+1))+
theme_bw()
By using scale_color_identity(), you don't need to create the separate named vector that scale_color_manual() requires, and you can use the 'Color' column directly. Note the change from geom_point(aes(colour = Group, ...)) to geom_point(aes(colour = Color, ...)).
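One thing to be aware of: identity scales do not draw a legend by default (their guide argument defaults to "none"), so show.legend = TRUE alone will not produce a colour legend. Below is a minimal sketch of one way to get the legend back with group names as labels. It assumes each group has its own distinct color; the data frame simply repeats the example data, and the legend title "Group" is my own choice.

```r
library(ggplot2)

# Example data from the question, rebuilt here so the snippet is self-contained.
data <- data.frame(
  Group = c("Medium", "Small", "Large", "Other"),
  x     = c(1, -1, 2, -1),
  y     = c(2, 2, -1, -1),
  size  = c(2000, 1000, 4000, 2500),
  Color = c("yellow", "red", "green", "blue"),
  stringsAsFactors = FALSE
)

p <- ggplot(data, aes(x = x, y = y, label = Group)) +
  geom_point(aes(size = size, colour = Color)) +
  # guide = "legend" switches the colour legend back on;
  # breaks/labels pair each color with its group name
  # (assumes one distinct color per group).
  scale_color_identity(
    name   = "Group",
    guide  = "legend",
    breaks = data$Color,
    labels = data$Group
  ) +
  geom_text(size = 4) +
  scale_size(range = c(5, 15)) +
  theme_bw()

print(p)
```

With breaks given explicitly, the legend entries should follow the order of the rows in the data rather than alphabetical order.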
Upvotes: 13
Reputation: 19867
One way to do this, as suggested by help("scale_colour_manual"), is to use a named character vector:
col <- as.character(data$Color)
names(col) <- as.character(data$Group)
Then pass this vector to the values argument of the scale:
# just showing the relevant line
scale_color_manual(values=col) +
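To see why the names matter, here is a small base R sketch (no ggplot2 involved) that imitates the two matching behaviours. It is only an illustration of the idea, not ggplot2's internal code; legend_order, by_position and by_name are names made up for this example.

```r
# Example data from the question, in CSV order.
data <- data.frame(
  Group = c("Medium", "Small", "Large", "Other"),
  Color = c("yellow", "red", "green", "blue"),
  stringsAsFactors = FALSE
)

# A discrete scale lists the groups alphabetically.
legend_order <- sort(unique(data$Group))

# Unnamed values are used by position, so the CSV-order colors
# get paired with the alphabetically ordered groups.
by_position <- setNames(data$Color, legend_order)

# Named values are looked up by group name instead.
col <- setNames(as.character(data$Color), as.character(data$Group))
by_name <- col[legend_order]

print(by_position)
print(by_name)
```

The first vector reproduces the mismatch described in the question (Large ends up yellow), while the second pairs each group with the color from its own row.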
Full code:
xlim<-max(c(abs(min(data$x)),abs(max(data$x))))
ylim<-max(c(abs(min(data$y)),abs(max(data$y))))
col <- as.character(data$Color)
names(col) <- as.character(data$Group)
ggplot(data, aes(x = x, y = y, label = Group)) +
geom_point(aes(size = size, colour = Group), show.legend = TRUE) +
scale_color_manual(values=col) +
geom_text(size = 4) +
scale_size(range = c(5,15)) +
scale_x_continuous(name="x", limits=c(xlim*-1-1,xlim+1))+
scale_y_continuous(name="y", limits=c(ylim*-1-1,ylim+1))+
theme_bw()
Output:
Data
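# text = makes read.table() parse the string itself instead of treating it as a file name
data <- read.table(text = "Group x y size Color
Medium 1 2 2000 yellow
Small -1 2 1000 red
Large 2 -1 4000 green
Other -1 -1 2500 blue", header = TRUE)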
Upvotes: 27