I'm hoping to have a legend that includes references to all colours, not just the vertical lines, and does not include a title.
I've tried scale_colour_manual and scale_fill_manual and they all either overlap or only show the vertical lines. I would appreciate any suggestions.
Reprex is below, including the custom colour palette.
library(ggplot2)
library(data.table)
var1 <- c(head(randu$x,n=12))
var2 <- as.Date(c("2010-01-01","2010-02-01","2010-03-01","2010-04-01","2010-05-01","2010-06-01","2010-07-01","2010-08-01","2010-09-01","2010-10-01","2010-11-01","2010-12-01"))
var3 <- c(tail(randu[which(randu$x + randu$y < 1),]$x,n=12))
var4 <- c(tail(randu[which(randu$x + randu$y < 1),]$y,n=12))
dat <- data.frame(var1,var2,var3,var4)
setDT(dat)
dat$var5 <- dat[,(var3+var4)]
new_dates <- as.Date(c("2010-09-01","2010-05-01"))
cbp2 <- c("#000000", "#56B4E9", "#009E73", "#0072B2", "#D55E00", "#CC79A7")
ggplot()+
  geom_bar(data=dat,colour=cbp2[1],fill = cbp2[1],aes(x=var2,y=var5,colour="var4"),stat="identity")+
  geom_bar(data=dat,colour=cbp2[2],fill = cbp2[2],aes(x=var2,y=var3,colour="var3"),stat="identity")+
  geom_line(data=dat,colour=cbp2[1],aes(x=var2,y=var1))+
  geom_vline(data=data.frame(xintercept = new_dates),
             aes(xintercept = new_dates,linetype = "Changes", colour="red"),
             linetype="dashed",key_glyph = "path")+
  scale_color_manual(name = "",
                     values = c("red",cbp2[2],cbp2[1]),
                     breaks = c("red",cbp2[2],cbp2[1]),
                     labels = c("Changes","Var3","Var4"))+
  scale_fill_manual(name = "",
                    values = c(cbp2[2],cbp2[1]),
                    breaks = c(cbp2[2],cbp2[1]),
                    labels = c("var3","var4"))+
  ylab("")+
  xlab("")+
  scale_x_date(expand=c(0,0),date_breaks = "3 month", date_labels = "%b %y") +
  scale_y_continuous(labels = function(var5) paste0(var5*100, "%"),
                     limits=c(0,1),
                     breaks=c(0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1)) +
  theme(panel.background = element_blank(),
        axis.line = element_line(colour = "#000000"),
        axis.text.x = element_text(angle=60, hjust=1),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.title.x= (element_text(margin = unit(c(3, 0, 0, 0), "mm"))),
        legend.position = "top")
There's quite a lot to unpack here with this one, but I gave it my best shot.
First of all, consider what you are trying to plot here. Normally, it's not a problem to call things var1, var2, var3, ...; however, in this context it's really quite confusing. Consequently, for this solution, I will be re-posting your entire code reworked instead of just the plotting portion, for reasons I hope to outline in this answer.
With all that being said, here is my understanding about the nature of the dataset and your desire for the final plot:
- var2 in the dataset contains Date class information, and this is the common x axis for the entire plot.
- var1 contains values that are to be used for the y values of the geom_line plot layer.
- var3 and var4 contain values that are to be used for creation of the stacked barplot which should make up the background of the plot.
- var5 is a sum of var3 + var4, and was a device to create the plot. Herein, it will not be useful, given the data analysis we are to do on the dataset and the application of Tidy Data principles.
- The xintercept values for the geom_vline plot layer are supplied as the two dates in new_dates.
The OP's question indicates a need for the legend to be displayed correctly. In this case, we want to indicate:
- var3 and var4
- the geom_line plot layer. Assume the label will be var1.
Hope all that was correct!
I encourage the OP to read up on Tidy Data principles, which will make working with data such as this much more straightforward in the future. Herein, I will apply these principles to the dataset dat.
First of all, let's handle the bar layer data. Applying Tidy Data principles, we want to gather var3 and var4 into two columns: (1) one for the name of the variable ("var3" or "var4"), and (2) one for the value. We will be telling ggplot2 to "stack" the bars, so var5 is not needed here: ggplot2 will do that calculation automatically. To gather the columns together, my preference is always to use gather() from tidyr, with dplyr loaded for the %>% pipe:
library(dplyr)
library(tidyr)
library(ggplot2)
library(data.table)
var1 <- c(head(randu$x,n=12))
var2 <- as.Date(c("2010-01-01","2010-02-01","2010-03-01","2010-04-01","2010-05-01","2010-06-01","2010-07-01","2010-08-01","2010-09-01","2010-10-01","2010-11-01","2010-12-01"))
var3 <- c(tail(randu[which(randu$x + randu$y < 1),]$x,n=12))
var4 <- c(tail(randu[which(randu$x + randu$y < 1),]$y,n=12))
dat <- data.frame(var1,var2,var3,var4)
setDT(dat)
# dat$var5 <- dat[,(var3+var4)] no longer needed
new_dates <- as.Date(c("2010-09-01","2010-05-01"))
cbp2 <- c("#000000", "#56B4E9", "#009E73", "#0072B2", "#D55E00", "#CC79A7")
newdat <- dat %>%
  gather(key='var_name', value='value', -var2) # gather all columns except for var2
names(newdat) <- c('Dates', 'var_name', 'value')
newdat$var_name <- factor(newdat$var_name, levels=c('var4', 'var3','var1'))
In addition to gathering, note that I'm renaming the columns to make them a bit easier to follow when it comes to plotting. I'm also setting the order of the levels for newdat$var_name, because the order we specify determines the ordering used to create the plot. I want var3 to appear as a bar "under" var4, so we need to specify that var4 is first.
You could also create a separate dataset containing var2 and var1 to use for plotting the geom_line layer... but this also works fine.
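As a side note, gather() is marked as superseded in current tidyr releases in favour of pivot_longer(). The following is only a sketch of the same reshaping step with that function; the small toy data frame and its values are made up for illustration and stand in for dat.

```r
library(tidyr)

# Hypothetical stand-in for dat; the values are invented for illustration
toy <- data.frame(
  var2 = as.Date(c("2010-01-01", "2010-02-01")),
  var1 = c(0.5, 0.6),
  var3 = c(0.1, 0.2),
  var4 = c(0.3, 0.4)
)

# Reshape every column except var2 into name/value pairs
long <- pivot_longer(toy, cols = -var2, names_to = "var_name", values_to = "value")

# Same follow-up steps as in the answer: rename the date column, fix level order
names(long)[names(long) == "var2"] <- "Dates"
long$var_name <- factor(long$var_name, levels = c("var4", "var3", "var1"))
```

The result has one row per date and variable, with the same three columns (Dates, var_name, value) that the plotting code expects.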
For the plot, I've tried to organize the code into separate sections. What the OP was trying to do was to plot column-by-column, rather than using aes(fill=...) and aes(color=...) to set and create legends. In addition, the OP's original code had numerous examples of the following:
geom_*(aes(color=...), color=...)
The result of this in ggplot2 is that if you set an aesthetic value (like color=) outside of aes() while also mapping it inside aes(), the value on the outside overrides the mapping, which effectively removes that aesthetic from any legend. This was the biggest cause of trouble in the OP's example, and why certain items were the "right" color but did not appear in any legend.
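A minimal sketch of that override, using a made-up toy data frame rather than the OP's data: the same layer is built once with colour only mapped, and once with colour both mapped and set. Inspecting the computed layer data with layer_data() shows that the constant replaces the mapping.

```r
library(ggplot2)

# Hypothetical toy data, invented for illustration
toy <- data.frame(x = 1:4, y = c(2, 4, 3, 5), grp = c("a", "a", "b", "b"))

# colour is mapped inside aes() only: a colour scale (and legend) is created
p_mapped <- ggplot(toy, aes(x, y)) +
  geom_point(aes(colour = grp))

# colour is mapped inside aes() AND set outside it: the constant wins
p_overridden <- ggplot(toy, aes(x, y)) +
  geom_point(aes(colour = grp), colour = "black")

mapped_colours <- unique(layer_data(p_mapped, 1)$colour)
overridden_colours <- unique(layer_data(p_overridden, 1)$colour)

length(mapped_colours)   # one colour per level of grp
overridden_colours       # only the constant that was set outside aes()
```

In the second plot every point receives the constant colour, so there is nothing left for a colour legend to describe.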
Specifying arguments in aes() only indicates that a legend should be created and tells ggplot2 on what basis to apply color, fill, linetype, and so on; it does not actually specify the color. Colors should be specified using the scale_*_*() functions. In this case, three legends are created. The OP can organize them however they wish, but I tried to keep this example illustrative so it can be adapted, since it is still not entirely clear exactly how the legend should look.
Note that values= is used to apply the color, linetype, or fill aesthetic, and is done by feeding that argument a named vector. You can also use a non-named vector, in which case the attributes will be applied according to the ordering of the levels for that factor.
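To illustrate the difference between named and non-named values=, here is a small sketch. The two-row data frame is made up for illustration; the level order matches the one set on var_name earlier.

```r
library(ggplot2)

# Hypothetical two-bar dataset, invented for illustration
bars <- data.frame(
  Dates = as.Date(c("2010-01-01", "2010-01-01")),
  value = c(0.2, 0.3),
  var_name = factor(c("var3", "var4"), levels = c("var4", "var3"))
)

base <- ggplot(bars, aes(Dates, value, fill = var_name)) + geom_col()

# Named vector: each colour is matched to its level by name
p_named <- base +
  scale_fill_manual(values = c(var3 = "#56B4E9", var4 = "#000000"))

# Non-named vector: colours follow the level order (var4 first, then var3)
p_unnamed <- base +
  scale_fill_manual(values = c("#000000", "#56B4E9"))

named_fill <- layer_data(p_named, 1)$fill
unnamed_fill <- layer_data(p_unnamed, 1)$fill
identical(named_fill, unnamed_fill)
```

Both versions assign the same fills here, but the named form keeps working if the level order changes later, which is why it is used in the plot code.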
Note that I changed the line color of the geom_line to blue... just so that it stands out a bit. It would be a bit confusing otherwise, since there is a fill color that is also black.
# newdat (not dat) holds the Dates and value columns used in aes()
ggplot(newdat, aes(x=Dates, y=value)) +
  # plot layers
  geom_col(
    data=subset(newdat, var_name != 'var1'),
    aes(fill=var_name), position='stack') +
  geom_line(
    data=subset(newdat, var_name == 'var1'),
    aes(color=var_name)
  ) +
  geom_vline(data=data.frame(xintercept = new_dates),
             aes(xintercept = xintercept, linetype = "Changes"), colour="red",
             key_glyph = "path")+
  # color and legend settings
  scale_fill_manual(
    name="Fill",
    values=c('var3'=cbp2[2], 'var4'=cbp2[1])) +
  scale_color_manual(
    name='Color',
    values = 'blue') +
  scale_linetype_manual(
    name='Linetype',
    values=2) +
  # scale adjustment and theme stuff
  scale_y_continuous(labels = function(var5) paste0(var5*100, "%"),
                     limits=c(0,1),
                     breaks=c(0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1)) +
  theme(panel.background = element_blank(),
        axis.line = element_line(colour = "#000000"),
        axis.text.x = element_text(angle=60, hjust=1),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.title.x= (element_text(margin = unit(c(3, 0, 0, 0), "mm"))),
        legend.position = "top")
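The question asked for a legend without a title. One possible adjustment, sketched here as an assumption rather than as part of the plot above, is to pass name = NULL to each manual scale, which drops that legend's title. The named values for the color and linetype scales are illustrative choices matching the labels "var1" and "Changes".

```r
library(ggplot2)

cbp2 <- c("#000000", "#56B4E9", "#009E73", "#0072B2", "#D55E00", "#CC79A7")

# name = NULL removes the legend title for that scale
fill_scale <- scale_fill_manual(
  name = NULL,
  values = c(var3 = cbp2[2], var4 = cbp2[1]))
colour_scale <- scale_color_manual(
  name = NULL,
  values = c(var1 = "blue"))
linetype_scale <- scale_linetype_manual(
  name = NULL,
  values = c(Changes = 2))
```

These three objects could be added to the plot with + in place of the corresponding scale_*_manual() calls.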