Reputation: 25
I'm struggling to learn the ins and outs of R, ggplot2, etc. I'm more used to being taught an entire (fixed) coding language from A to Z (I'm not used to open source; I learned to code when dinosaurs roamed the earth). So I have kluged together the following code to create one graph. Only... I don't have the duplicate-legends problem. I have no legend at all!
erc <- ggplot(usedcarval, aes(x = usedcarval$age)) +
geom_line(aes(y = usedcarval$dealer), colour = "orange", size = .5) +
geom_point(aes(y = usedcarval$dealer),
show.legend = TRUE, colour = "orange", size = 1) +
geom_line(aes(y = usedcarval$pvtsell), colour = "green", size = .5) +
geom_point(aes(y = usedcarval$pvtsell), colour = "green", size = 1) +
geom_line(aes(y = usedcarval$tradein), colour = "blue", size = .5) +
geom_point(aes(y = usedcarval$tradein), colour = "blue", size = 1) +
geom_line(aes(y = as.integer(predvalt)), colour = "gray", size = 1) +
geom_line(aes(y = as.integer(predvalp)), colour = "gray", size = 1) +
geom_line(aes(y = as.integer(predvald)), colour = "gray", size = 1) +
labs(x = "Value of a Used Car as it Ages (Years)", y = "Dollars") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5)) +
theme(axis.text.x = element_text(angle = 60, vjust = .6))
erc
I can't figure out how to put an image in this text since I have no link except to my dropbox...
I would appreciate any help. Sincerely, Stephanie
Upvotes: 1
Views: 3543
Reputation: 22827
If you are looking for a short example of how to take series data that comes in wide format, convert it to long format (using gather), and then plot it with ggplot (with a legend), here is one I cooked up for someone recently:
library(ggplot2)
library(tidyr)
# womp up some fake news (uhh... data)
x <- seq(-pi,pi,by=0.25)
y <- sin(x)
yhat <- sin(x) + 0.4*rnorm(length(x))
# This is the data in wide form
# you will never get ggplot to make a legend for it
# it simply hates wide data
df1 <- data.frame(x=x,y=y,yhat=yhat)
# So we use gather from tidyr to make it into long data
# creates two new columns (series = the old column name, value = its value) and replicates x as needed
# you have to look at the data frame to understand gather,
# and read the docs a few times
df2 <- gather(df1,series,value,-x)
# it is now in long form and we can plot it
ggplot(df2) + geom_line(aes(x,value,color=series))
So here is the plot:
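For readers on a newer tidyr (1.0.0 or later), the same reshaping can be written with pivot_longer(), which tidyr now recommends over gather(). This is only a sketch on a tiny made-up data frame (the name wide and its values are assumptions for illustration, laid out like df1 above), so the long result is easy to read by eye:

```r
library(tidyr)

# Hypothetical wide data, same layout as df1: one x column plus one column per series
wide <- data.frame(x = c(1, 2, 3),
                   y = c(10, 20, 30),
                   yhat = c(11, 19, 32))

# Newer spelling of gather(wide, series, value, -x):
# every column except x is stacked into a name column and a value column
long <- pivot_longer(wide, cols = -x, names_to = "series", values_to = "value")

# Three columns (x, series, value) and one row per original cell of y and yhat
print(long)
```

The long frame can be passed to ggplot exactly like df2, with color = series inside aes().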
Upvotes: 1
Reputation: 22827
Ok, I felt like doing some ggplot, and it was an interesting task to contrast the way ggplot beginners (I was one not so long ago) approach it with the way you need to do it to get things like legends.
Here is the code:
library(ggplot2)
library(gridExtra)
library(tidyr)
library(dplyr)  # provides filter() used below; without it, filter() resolves to stats::filter
# fake up some data
n <- 100
dealer <- 12000 + rnorm(n,0,100)
age <- 10 + rnorm(n,3)
pvtsell <- 10000 + rnorm(n,0,300)
tradein <- 5000 + rnorm(n,0,100)
predvalt <- 6000 + rnorm(n,0,120)
predvalp <- 7000 + rnorm(n,0,100)
predvald <- 8000 + rnorm(n,0,100)
usedcarval <- data.frame(dealer=dealer,age=age,pvtsell=pvtsell,tradein=tradein,
predvalt=predvalt,predvalp=predvalp,predvald=predvald)
# The ggplot-naive way
erc <- ggplot(usedcarval, aes(x = usedcarval$age)) +
geom_line(aes(y = usedcarval$dealer), colour = "orange", size = .5) +
geom_point(aes(y = usedcarval$dealer),
show.legend = TRUE, colour = "orange", size = 1) +
geom_line(aes(y = usedcarval$pvtsell), colour = "green", size = .5) +
geom_point(aes(y = usedcarval$pvtsell), colour = "green", size = 1) +
geom_line(aes(y = usedcarval$tradein), colour = "blue", size = .5) +
geom_point(aes(y = usedcarval$tradein), colour = "blue", size = 1) +
geom_line(aes(y = as.integer(predvalt)), colour = "gray", size = 1) +
geom_line(aes(y = as.integer(predvalp)), colour = "gray", size = 1) +
geom_line(aes(y = as.integer(predvald)), colour = "gray", size = 1) +
labs(x = "ggplot naive way - Value of a Used Car as it Ages (Years)", y = "Dollars") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5)) +
theme(axis.text.x = element_text(angle = 60, vjust = .6))
# The tidyverse way
# ggplot needs long data, not wide data.
# Also we have two different sets of data for points and lines
gdf <- usedcarval %>% gather(series,value,-age)
pdf <- gdf %>% filter( series %in% c("dealer","pvtsell","tradein"))
# our color and size lookup tables
clrs = c("dealer"="orange","pvtsell"="green","tradein"="blue","predvalt"="gray","predvalp"="gray","predvald"="gray")
szes = c("dealer"=0.5,"pvtsell"=0.5,"tradein"=0.5,"predvalt"=1,"predvalp"=1,"predvald"=1)
trc <- ggplot(gdf,aes(x=age)) + geom_line(aes(y=value,color=series,size=series)) +
scale_color_manual(values=clrs) +
scale_size_manual(values=szes) +
geom_point(data=pdf,aes(x=age,y=value,color=series),size=1) +
labs(x = "tidyverse way - Value of a Used Car as it Ages (Years)", y = "Dollars") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5)) +
theme(axis.text.x = element_text(angle = 60, vjust = .6))
grid.arrange(erc, trc, ncol=1)
Study it, and especially look at gdf, pdf and gather. ggplot builds a legend only for aesthetics that are mapped to a column inside aes(), and in practice that means you need "long data".
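To make the legend point concrete, here is a minimal sketch that contrasts a colour set outside aes() with a colour mapped inside aes(). The data frame d, its values, and the legend title "Price type" are made up for illustration; only the series names are borrowed from the question.

```r
library(ggplot2)

# Hypothetical long data: one row per (age, series) pair
d <- data.frame(age = c(1, 2, 3, 1, 2, 3),
                value = c(12000, 11000, 10000, 5000, 4500, 4000),
                series = rep(c("dealer", "tradein"), each = 3))

# Colour given outside aes(): a fixed setting, so no colour scale and no legend
no_legend <- ggplot(d, aes(x = age, y = value, group = series)) +
  geom_line(colour = "orange")

# Colour mapped to a column inside aes(): ggplot adds a scale and draws a legend
with_legend <- ggplot(d, aes(x = age, y = value, colour = series)) +
  geom_line() +
  scale_colour_manual(name = "Price type",
                      values = c(dealer = "orange", tradein = "blue"),
                      breaks = c("dealer", "tradein"),
                      labels = c("Dealer", "Trade-in"))

print(with_legend)
```

Printing no_legend instead shows the same two lines, both orange, with no legend, which is what happened in the question's code.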
If you want more information on the "tidyverse", start here: Hadley Wickham's tidyverse
Upvotes: 2