Reputation: 101
How can I show a loess-smoothed trend line in the plot? Please help me deal with the warning message: "Removed 19 rows containing non-finite values (stat_smooth)".
My data:
yrcnt<-read.table(header = TRUE, text = "year outcome pop rate pred.SC
1 1995 2306 87592001 2.632660 0.9626214
2 1996 2221 87628543 2.534562 0.9599941
3 1997 2202 81872629 2.689544 0.9573667
4 1998 2316 88200076 2.625848 0.9547394
5 1999 2456 96200312 2.553006 0.9521121
6 2000 2526 99565063 2.537035 0.9494848
7 2001 2511 95951330 2.616952 0.9468575
8 2002 2537 96976191 2.616106 0.9442302
9 2003 2618 101673130 2.574918 0.9416028
10 2004 2644 104554479 2.528825 0.9389755
11 2005 2594 100522055 2.580528 0.9363482
12 2006 2620 105787278 2.476668 0.9337209
13 2007 2722 108946407 2.498476 0.9310936
14 2008 2788 112200567 2.484836 0.9284663
15 2009 2706 104491560 2.589683 0.9258389
16 2010 2773 108651896 2.552187 0.9232116
17 2011 2764 109632577 2.521148 0.9205843
18 2012 2694 107594922 2.503836 0.9179570
19 2013 2673 107553219 2.485281 0.9153297")
http://tutorials.iq.harvard.edu/R/Rgraphics/Rgraphics.html
My code:
p1 <- ggplot(yrcnt, aes(y = log(rate), x = year))
yrcnt$pred.SC <- predict(lm(year ~ log(rate), data = yrcnt))
p1 + geom_line(aes(color = rate)) +geom_line(aes(y = pred.SC))
p1 + geom_line(aes(color = rate)) + geom_smooth()
p4 <- p1 + geom_line(aes(color = rate)) + geom_smooth(color="red")
p4 + scale_x_continuous(name = "Years",limits = c(1995, 2013),breaks = 1994:2014) +
scale_y_continuous(name = "Pancreatic Cancer Hospitalization Rate, 1995-2013",limits = c(2.4, 2.7),breaks = seq(2.4, 2.7, by = 0.1)) +
ggtitle("Long Term Trend in Pancreatic Cancer Hospitalizations")
>`geom_smooth()` using method = 'loess'
p4 = base plot with trend line
Plot with the custom scales, which failed to incorporate the base plot p4
Upvotes: 0
Views: 1824
Reputation: 9295
In the call to scale_y_continuous(), remove the arguments
limits = c(2.4, 2.7),
breaks = seq(2.4, 2.7, by = 0.1)
The y aesthetic is log(rate), whose values lie between about 0.9 and 1, but these arguments restrict the axis to the range 2.4 to 2.7. All 19 rows therefore fall outside the limits and are dropped before smoothing, which is what the warning reports. I don't know whether you need rate or log(rate) here.
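The mismatch can be checked directly. The sketch below rebuilds only the year and rate columns from the question, counts how many log(rate) values fall outside the requested limits, and then zooms with coord_cartesian() rather than scale limits. coord_cartesian() changes only the visible window, so no rows are discarded before the smoother runs. The window c(0.9, 1) and the explicit formula = y ~ x are my assumptions, not part of the question.

```r
library(ggplot2)

# Year and rate columns copied from the question's yrcnt data frame
yrcnt <- data.frame(
  year = 1995:2013,
  rate = c(2.632660, 2.534562, 2.689544, 2.625848, 2.553006, 2.537035,
           2.616952, 2.616106, 2.574918, 2.528825, 2.580528, 2.476668,
           2.498476, 2.484836, 2.589683, 2.552187, 2.521148, 2.503836,
           2.485281)
)

# The plotted y values are log(rate), all of which lie between 0.9 and 1
y <- log(yrcnt$rate)
range(y)

# Count the rows that fall outside limits = c(2.4, 2.7)
n_outside <- sum(y < 2.4 | y > 2.7)
n_outside  # 19, matching the count in the warning

# Zoom without dropping data: coord_cartesian() instead of scale limits
# (the window c(0.9, 1) is an assumed choice that covers the data)
p <- ggplot(yrcnt, aes(x = year, y = log(rate))) +
  geom_line() +
  geom_smooth(method = "loess", formula = y ~ x, color = "red") +
  coord_cartesian(ylim = c(0.9, 1))
p
```

If the axis really should run from 2.4 to 2.7, the data has to be on the rate scale rather than the log scale.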
An alternative would be
library(ggplot2)
# lm() formula flipped to log(rate) ~ year; the predictions are then
# wrapped in exp() to bring them back to the rate scale.
# This is done before ggplot() is called so that p1 sees the new pred.SC.
yrcnt$pred.SC <- exp(predict(lm(log(rate) ~ year, data = yrcnt)))
# Plot rate on its original scale instead of log(rate)
p1 <- ggplot(yrcnt, aes(y = rate, x = year))
p1 + geom_line(aes(color = rate)) + geom_line(aes(y = pred.SC))
p4 <- p1 + geom_line(aes(color = rate)) + geom_smooth(color = "red", method = "loess")
p4 + scale_x_continuous(name = "Years", limits = c(1995, 2013), breaks = 1994:2014) +
  scale_y_continuous(name = "Pancreatic Cancer Hospitalization Rate, 1995-2013", limits = c(2.4, 2.7), breaks = seq(2.4, 2.7, by = 0.1)) +
  theme(legend.position = "none") +
  ggtitle("Long Term Trend in Pancreatic Cancer Hospitalizations")
Upvotes: 1