lken

Reputation: 29

extending regression lines with geom_line() ggplot2

I used a negative binomial GLM to look at vertebrate abundance in seagrass. Because the model has an interaction term, I predicted fish abundance over a grid of values. I would like these predicted regression lines to reach the end of the plot space. Right now they all cut off at different points, as in the example below:

library(MASS)     # provides glm.nb()
library(ggplot2)

total<-c(1,0,5,7,9,10,23,45,78,100)
shoots_collected<-c(1,2,3,4,5,6,7,45,67,88)
epi_bio<-c(0.0,11,0.89,1.5,9,5,.04,6,7,.9)
Year<-c(1,1,1,1,2,2,2,2,1,1)
Year<-as.factor(Year)
intertidal<-data.frame(shoots_collected,Year,epi_bio, total)

glm.neg<-glm.nb(total~Year+shoots_collected+epi_bio+shoots_collected*epi_bio, 
data=intertidal)
summary(glm.neg)
abun_shoots2015<-data.frame("shoots_collected"=rep(0:30, rep(5,31)), 
"epi_bio"=rep(c(0,1,2,3,4), 31), "Year"=rep("1", 155))

# then extracted predicted values using:

p2015<-predict(glm.neg, newdata=abun_shoots2015, se.fit=TRUE, type='response')
abun_shoots2015$fit<-p2015$fit
ggplot(intertidal, aes(x=shoots_collected, y=total)) +
scale_x_continuous(limits = c(0, 30))+
scale_y_continuous(limits=c(0,10))+
 geom_point(pch=1)+
 geom_line(data=abun_shoots2015[which(abun_shoots2015$epi_bio==0.0000),], aes(x=shoots_collected, y=fit), col="red")+
 geom_line(data=abun_shoots2015[which(abun_shoots2015$epi_bio==1),], aes(x=shoots_collected, y=fit), col="green")+
 geom_line(data=abun_shoots2015[which(abun_shoots2015$epi_bio==2),], aes(x=shoots_collected, y=fit), col="blue")+
 geom_line(data=abun_shoots2015[which(abun_shoots2015$epi_bio==3),], aes(x=shoots_collected, y=fit), col="yellow")+
 geom_line(data=abun_shoots2015[which(abun_shoots2015$epi_bio==4),], aes(x=shoots_collected, y=fit), col="pink")

I was previously using the lines() command, but switched to geom_line() so that I could use fullrange=TRUE; it still did not work. I see that I have some missing values when I try to plot the lines, and I suspect that is why some are being cut off, but I don't know where to go from here.

Upvotes: 1

Views: 1806

Answers (1)

Gavin Simpson

Reputation: 174928

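You don't want to set the limits with scale_x_continuous() and scale_y_continuous() here: scale limits remove any data that lie beyond them (out-of-range values are converted to NA before the geoms are drawn), which is why each line stops wherever its fitted values leave the stated range. Instead, you want to keep all the data and restrict only the region of the plot that is shown. This is done with coord_cartesian(), as in: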

ggplot(intertidal, aes(x=shoots_collected, y=total)) +
 coord_cartesian(xlim = c(0, 30), ylim = c(0,10)) + ## KEY!
 geom_point(pch=1)+
 geom_line(data=abun_shoots2015[which(abun_shoots2015$epi_bio==0.0000),], 
           aes(x=shoots_collected, y=fit), col="red")+
 geom_line(data=abun_shoots2015[which(abun_shoots2015$epi_bio==1),], 
           aes(x=shoots_collected, y=fit), col="green")+
 geom_line(data=abun_shoots2015[which(abun_shoots2015$epi_bio==2),], 
           aes(x=shoots_collected, y=fit), col="blue")+
 geom_line(data=abun_shoots2015[which(abun_shoots2015$epi_bio==3),], 
           aes(x=shoots_collected, y=fit), col="yellow")+
 geom_line(data=abun_shoots2015[which(abun_shoots2015$epi_bio==4),], 
           aes(x=shoots_collected, y=fit), col="pink")
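
To see the difference in isolation, here is a minimal sketch on made-up data. The data frame `d` and the object names are hypothetical and are not part of your model. It assumes that layer_data() returns the data after the scales have been applied, so censored values show up as NA.

```r
library(ggplot2)

# Hypothetical toy data: a straight line that leaves the y range of interest
d <- data.frame(x = 0:30, y = 0:30)

# Scale limits: y values outside [0, 10] are converted to NA before drawing
p_scale <- ggplot(d, aes(x, y)) +
  geom_line() +
  scale_y_continuous(limits = c(0, 10))

# Coordinate limits: all data are kept, only the visible region is zoomed
p_coord <- ggplot(d, aes(x, y)) +
  geom_line() +
  coord_cartesian(ylim = c(0, 10))

# Count the NA y values in the data each layer will actually draw
n_na_scale <- sum(is.na(layer_data(p_scale)$y))
n_na_coord <- sum(is.na(layer_data(p_coord)$y))

n_na_scale  # greater than zero: points above 10 were censored
n_na_coord  # zero: nothing was dropped
```

Printing p_scale should give a line that ends where y reaches 10, while p_coord runs to the edge of the panel.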

Also, I feel compelled to add that your plot could be more nicely produced by treating epi_bio as a factor:

ggplot(intertidal, aes(x=shoots_collected, y=total)) +
  coord_cartesian(xlim = c(0, 30), ylim = c(0,10)) + ## KEY!
  geom_point(pch=1) +
  geom_line(data = abun_shoots2015, aes(y = fit, colour = as.factor(epi_bio))) +
  scale_colour_discrete(name = "epi_bio") +
  theme(legend.position = "top")


Upvotes: 1
