Maggie

Reputation: 101

Show a loess-smoothed trend line in a line plot (ggplot2)

How can I show a loess-smoothed trend line in the plot? I also need help with this warning message: "Removed 19 rows containing non-finite values (stat_smooth)".

My data:

yrcnt<-read.table(header = TRUE, text = "year outcome pop rate pred.SC
1  1995    2306  87592001 2.632660 0.9626214
2  1996    2221  87628543 2.534562 0.9599941
3  1997    2202  81872629 2.689544 0.9573667
4  1998    2316  88200076 2.625848 0.9547394
5  1999    2456  96200312 2.553006 0.9521121
6  2000    2526  99565063 2.537035 0.9494848
7  2001    2511  95951330 2.616952 0.9468575
8  2002    2537  96976191 2.616106 0.9442302
9  2003    2618 101673130 2.574918 0.9416028
10 2004    2644 104554479 2.528825 0.9389755
11 2005    2594 100522055 2.580528 0.9363482
12 2006    2620 105787278 2.476668 0.9337209
13 2007    2722 108946407 2.498476 0.9310936
14 2008    2788 112200567 2.484836 0.9284663
15 2009    2706 104491560 2.589683 0.9258389
16 2010    2773 108651896 2.552187 0.9232116
17 2011    2764 109632577 2.521148 0.9205843
18 2012    2694 107594922 2.503836 0.9179570
19 2013    2673 107553219 2.485281 0.9153297")

http://tutorials.iq.harvard.edu/R/Rgraphics/Rgraphics.html 

My code:

    p1 <- ggplot(yrcnt, aes(y = log(rate), x = year))
    yrcnt$pred.SC <- predict(lm(year ~ log(rate), data = yrcnt))
    p1 + geom_line(aes(color = rate)) +geom_line(aes(y = pred.SC))
    p1 + geom_line(aes(color = rate)) + geom_smooth()
    p4 <- p1 + geom_line(aes(color = rate)) + geom_smooth(color="red")
    p4 + scale_x_continuous(name = "Years",limits = c(1995, 2013),breaks = 1994:2014) +
      scale_y_continuous(name = "Pancreatic Cancer Hospitalization Rate, 1995-2013",limits = c(2.4, 2.7),breaks = seq(2.4, 2.7, by = 0.1)) +
      ggtitle("Long Term Trend in Pancreatic Cancer Hospitalizations")
>`geom_smooth()` using method = 'loess'

[Figure: p4, the base plot with trend line]

[Figure: the scaled plot, which failed to incorporate the base plot p4]

Upvotes: 0

Views: 1824

Answers (1)

knb

Reputation: 9295

In the call to scale_y_continuous(), remove the arguments

    limits = c(2.4, 2.7),
    breaks = seq(2.4, 2.7, by = 0.1)

The plotted y values are log(rate), which lie between 0.9 and 1, but you are setting the axis limits to the range 2.4 to 2.7. All 19 points fall outside those limits, so they are turned into missing values before the smoother runs, which is what the warning reports. I don't know whether you need rate or log(rate) here.
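
The range mismatch can be checked without drawing anything. The following is a minimal base-R sketch, not part of the original answer: it copies the rate column from the question and counts how many log(rate) values fall outside the limits 2.4 to 2.7. The names `y`, `limits` and `n_dropped` are illustrative.

```r
# Rate column copied from the question's data (1995-2013).
rate <- c(2.632660, 2.534562, 2.689544, 2.625848, 2.553006, 2.537035,
          2.616952, 2.616106, 2.574918, 2.528825, 2.580528, 2.476668,
          2.498476, 2.484836, 2.589683, 2.552187, 2.521148, 2.503836,
          2.485281)

# The plot maps y to log(rate), so these are the values the y scale sees.
y <- log(rate)
print(range(y))

# Limits passed to scale_y_continuous() in the question.
limits <- c(2.4, 2.7)

# Count the points that fall outside those limits.
n_dropped <- sum(y < limits[1] | y > limits[2])
print(n_dropped)
```

Every value of log(rate) is below 2.4, so the count equals the number of rows in the data, which matches the 19 rows mentioned in the warning.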

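An alternative is to plot rate itself and fit the linear model on the log scale, so that the limits 2.4 to 2.7 match the data: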

    library('ggplot2')
    p1 <- ggplot(yrcnt, aes(y = rate, x = year))

    # lm() formula flipped to log(rate) ~ year; predictions are wrapped
    # in exp() to bring them back to the rate scale.
    yrcnt$pred.SC <- exp(predict(lm(log(rate) ~ year, data = yrcnt)))

    # Line plot with the fitted linear trend.
    p1 + geom_line(aes(color = rate)) + geom_line(aes(y = pred.SC))

    # Line plot with a loess smoother; the y limits now match the rate scale.
    p4 <- p1 + geom_line(aes(color = rate)) + geom_smooth(color = "red", method = "loess")
    p4 + scale_x_continuous(name = "Years", limits = c(1995, 2013), breaks = 1994:2014) +
      scale_y_continuous(name = "Pancreatic Cancer Hospitalization Rate, 1995-2013", limits = c(2.4, 2.7), breaks = seq(2.4, 2.7, by = 0.1)) +
      theme(legend.position = "none") +
      ggtitle("Long Term Trend in Pancreatic Cancer Hospitalizations")
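
If log(rate) is what you actually want on the y axis, one possible variant (a sketch, not part of the original answer) keeps the log transform and zooms with coord_cartesian() rather than setting scale limits. coord_cartesian() changes only the visible window, so no rows are removed before the loess fit. The y range 0.90 to 1.00, the axis label "log(rate)" and the object name `p_log` are assumptions chosen for illustration.

```r
library(ggplot2)

# Only the two columns needed for the plot, copied from the question's data.
yrcnt <- data.frame(
  year = 1995:2013,
  rate = c(2.632660, 2.534562, 2.689544, 2.625848, 2.553006, 2.537035,
           2.616952, 2.616106, 2.574918, 2.528825, 2.580528, 2.476668,
           2.498476, 2.484836, 2.589683, 2.552187, 2.521148, 2.503836,
           2.485281)
)

# Keep log(rate) on the y axis; zoom with coord_cartesian() instead of
# scale limits, so the smoother still sees all rows.
p_log <- ggplot(yrcnt, aes(x = year, y = log(rate))) +
  geom_line() +
  geom_smooth(method = "loess", formula = y ~ x, color = "red") +
  scale_x_continuous(name = "Years", breaks = seq(1995, 2013, by = 2)) +
  scale_y_continuous(name = "log(rate)", breaks = seq(0.90, 1.00, by = 0.02)) +
  coord_cartesian(ylim = c(0.90, 1.00)) +
  ggtitle("Long Term Trend in Pancreatic Cancer Hospitalizations")

print(p_log)
```

The design choice here is that scale limits discard out-of-range data before statistics are computed, while coord_cartesian() only crops the view, which makes it the safer option when a smoother is involved.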

Upvotes: 1
