Reputation: 2550
I have data that I am trying to plot. I have several variables that range from the years 1880-2012. I have one observation per year. But sometimes a variable does not have an observation for a number of years. For example, it may have an observation from 1880-1888, but then not from 1889-1955 and then from 1956-2012. I would like ggplot2 + geom_line to not have anything in the missing years (1889-1955). But it connects 1888 and 1956 with a straight line. Is there anything I can do to remove this line? I am using the ggplot function.
Unrelated question, but is there a way to get ggplot to not sort my variable names in the legend alphabetically? I have code like this:
ggplot(dataFrame, aes(Year, value, colour=Name)) + geom_line()
Alternatively, is there a way to add numbers in front of the variable names (Name1, ..., Name10) in the legend? For example: 1. Name1, 2. Name2, ..., 10. Name10
Upvotes: 8
Views: 10645
Reputation: 1001
Here's some sample data to answer your questions. I've added the geom_point() function to make it easier to see which values are in the data:
library(ggplot2)
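set.seed(1234)  # make the random values reproducible
# five series, one observation per year for 2000-2013
dat <- data.frame(Year=rep(2000:2013,5),
                  value=rep(1:5,each=14)+rnorm(5*14,0,.5),
                  Name=rep(c("Name1","End","First","Name2","Name 3"),each=14))
dat2 <- dat
# blank out 12 randomly chosen values
dat2$value[sample.int(5*14,12)] <- NA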
dat3 is probably close to what your data looks like, except that I'm treating Year as an integer.
dat3 <- dat2[!is.na(dat2$value),]
# POINTS ARE CONNECTED WITH NO DATA IN BETWEEN #
ggplot(dat3, aes(Year, value, colour=Name)) +
geom_line() + geom_point()
However, if you add rows to your data for the missing years and set value to NA in those rows, then when you plot the data you'll get the gaps.
# POINTS ARE NOT CONNECTED #
ggplot(dat2, aes(Year, value, colour=Name)) +
geom_line() + geom_point()
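If your data looks like dat3 (the rows for the missing years are simply absent), one way to get the NA rows is to build the full Year-by-Name grid and merge your data onto it. This is only a sketch in base R; the small data frame here and the names full_grid and dat_filled are made up for illustration.

```r
# Toy data: Name1 has no rows at all for 2003-2005, Name2 is complete.
dat3 <- data.frame(
  Year  = c(2000:2002, 2006:2007, 2000:2007),
  value = c(1.1, 1.3, 1.2, 2.0, 2.1, 3 + (0:7) / 10),
  Name  = rep(c("Name1", "Name2"), times = c(5, 8))
)

# Every combination of year and series that should exist.
full_grid <- expand.grid(
  Year = seq(min(dat3$Year), max(dat3$Year)),
  Name = unique(dat3$Name),
  stringsAsFactors = FALSE
)

# Left join: combinations with no observation get value = NA.
dat_filled <- merge(full_grid, dat3, by = c("Year", "Name"), all.x = TRUE)
dat_filled <- dat_filled[order(dat_filled$Name, dat_filled$Year), ]
```

Passing dat_filled to the same ggplot(...) + geom_line() call as above should then leave a gap for Name1 between 2002 and 2006 instead of a connecting line.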
And finally, to answer your last question, this is how you change the order and labels of Name in the legend:
# CHANGE THE ORDER AND LABELS IN THE LEGEND #
ggplot(dat2, aes(Year, value, colour=Name)) +
geom_line() + geom_point() +
scale_colour_discrete(labels=c("Beginning","Name 1","Name 2","Name 3","End"),
breaks=c("First","Name1","Name2","Name 3","End"))
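For the numbered labels you asked about (1. Name1, 2. Name2, ...), the labels vector can be generated rather than typed out. A small sketch, assuming the order below is the one you want in the legend; legend_order and legend_labels are just illustrative names.

```r
# Desired legend order, using the values that actually occur in Name.
legend_order <- c("First", "Name1", "Name2", "Name 3", "End")

# Prefix each entry with its position: "1. First", "2. Name1", ...
legend_labels <- paste0(seq_along(legend_order), ". ", legend_order)

print(legend_labels)
```

These can then be passed as breaks=legend_order and labels=legend_labels to scale_colour_discrete() in the plot above.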
Upvotes: 11