bill999

Reputation: 2550

R - ggplot2 - geom_line - Get rid of straight line for missing values

I have data that I am trying to plot. I have several variables that range from the years 1880-2012. I have one observation per year. But sometimes a variable does not have an observation for a number of years. For example, it may have an observation from 1880-1888, but then not from 1889-1955 and then from 1956-2012. I would like ggplot2 + geom_line to not have anything in the missing years (1889-1955). But it connects 1888 and 1956 with a straight line. Is there anything I can do to remove this line? I am using the ggplot function.

Unrelated question, but is there a way to get ggplot to not sort my variable names in the legend alphabetically? I have code like this:

ggplot(dataFrame, aes(Year, value, colour=Name)) + geom_line()

Alternatively, is there a way to add numbers in front of the variable names (Name1, ..., Name10) in the legend? For example: 1. Name1, 2. Name2, ..., 10. Name10.

Upvotes: 8

Views: 10645

Answers (1)

Mark Nielsen

Reputation: 1001

Here's some sample data to answer your questions. I've added geom_point() to make it easier to see which values are actually in the data:

library(ggplot2)
set.seed(1234)  # make the random sample data reproducible
dat <- data.frame(Year=rep(2000:2013,5),
            value=rep(1:5,each=14)+rnorm(5*14,0,.5),
            Name=rep(c("Name1","End","First","Name2","Name 3"),each=14))
dat2 <- dat
# blank out 12 randomly chosen values to simulate missing observations
dat2$value[sample.int(5*14,12)] <- NA

dat3 is probably closest to what your data looks like (the rows for the missing years are simply absent), except that I'm treating Year as an integer.

dat3 <- dat2[!is.na(dat2$value),]

# POINTS ARE CONNECTED WITH NO DATA IN BETWEEN #
ggplot(dat3, aes(Year, value, colour=Name)) + 
     geom_line() + geom_point()

However, if you add rows to your data for the missing years and set value to NA in those rows, geom_line() will leave gaps at those years when you plot the data.

# POINTS ARE NOT CONNECTED #
ggplot(dat2, aes(Year, value, colour=Name)) + 
     geom_line() + geom_point()
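
If your data looks like dat3 (no rows at all for the missing years), you first need to create those rows. Here is a minimal base R sketch of one way to do that, assuming one row per Year and Name. The data frame obs and the names full_grid and filled are made up for illustration and are not part of the sample data above:

```r
# Hypothetical example data: one series with no rows for 2003-2009.
obs <- data.frame(
  Year  = c(2000:2002, 2010:2012),
  Name  = "Name1",
  value = c(1.0, 1.2, 0.9, 1.4, 1.1, 1.3),
  stringsAsFactors = FALSE
)

# Every Year/Name combination that should be present in the plot data.
full_grid <- expand.grid(
  Year = seq(min(obs$Year), max(obs$Year)),
  Name = unique(obs$Name),
  stringsAsFactors = FALSE
)

# Left join onto the grid: combinations absent from obs get value = NA.
filled <- merge(full_grid, obs, by = c("Year", "Name"), all.x = TRUE)

# Keep rows ordered by series and year for readability.
filled <- filled[order(filled$Name, filled$Year), ]
```

Passing filled to the same ggplot() call in place of dat2 should then leave a gap over 2003-2009 instead of a connecting line. If you use tidyr, its complete() function is another way to build the missing rows.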

And finally, to answer your last question, this is how you change the order and labels of Name in the legend:

# CHANGE THE ORDER AND LABELS IN THE LEGEND #
ggplot(dat2, aes(Year, value, colour=Name)) + 
     geom_line() + geom_point() + 
     scale_colour_discrete(labels=c("Beginning","Name 1","Name 2","Name 3","End"),
                             breaks=c("First","Name1","Name2","Name 3","End"))
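
Another option for the ordering, sketched here with made-up variable names (legend_order, name_col, name_fac, numbered), is to turn Name into a factor with explicit levels, since discrete scales follow factor level order rather than alphabetical order. The same vector can be used to build the numbered labels you asked about:

```r
# Desired legend order, using the names from the sample data.
legend_order <- c("First", "Name1", "Name2", "Name 3", "End")

# Hypothetical stand-in for the Name column of the plotted data.
name_col <- c("Name1", "End", "First", "Name2", "Name 3")

# A factor keeps its levels in the order given, not alphabetically.
name_fac <- factor(name_col, levels = legend_order)

# Numbered labels: "1. First", "2. Name1", and so on.
numbered <- paste0(seq_along(legend_order), ". ", legend_order)
```

With the real data this would be dat2$Name <- factor(dat2$Name, levels = legend_order) before plotting, and scale_colour_discrete(labels = numbered) to show the numbered labels in that order.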

Upvotes: 11
