Adding Data from predict() values to end of another plot in R

Question

I have a CSV file that contains population estimates for 2010-2019. I've used the predict() function to estimate the population from 2020 to 2024. How would I combine these two plots so that 2020 starts where 2019 leaves off on the x-axis? Would the function ggarrange be the best option?

Also, how would I change the x-axis tick marks to show 2020, 2021, 2022, 2023, 2024? They currently just show 1, 2, 3, 4, 5. I tried the scale_x_discrete function, but to no avail.

library(ggplot2)
library(tidyr)
library(tidyverse)

pops <- read_csv("nst-est2019-popchg2010_2019.csv")
OK_pops<- filter(pops, NAME == "Oklahoma")
pop_OK <- pivot_longer(OK_pops,
        cols=starts_with("POP"),
        names_to="Year",
        names_prefix = "POPESTIMATE",
        values_to = "Population"
)

options(digits=4)
pop_OK <- transform(pop_OK, Population=as.numeric(Population))
pop_OK <- transform(pop_OK, Year=as.numeric(Year))

str(pop_OK)

ggplot(pop_OK) + geom_point(aes(x=Year, y=Population))
abline(pop_OK)


model <-lm(formula = Population ~ Year, data = pop_OK)
summary(model)
pred <- predict(model, newdata=data.frame(Year=2020:2024))
setNames(pred, 2020:2024)

plot(pred, pch = 16, col = "blue" )
scale_x_discrete(breaks=c("1", "2", "3", "4", "5"),
                  labels=c("2020","2021","2022","2023","2024"))

Sally_ar · Accepted Answer

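You need to use rbind, but note that predict() as called in the question returns a plain numeric vector, so there is no pred$fit. Put the predictions in a data frame alongside their years first, similar to this:

new_data <- rbind(pop_OK[, c("Year", "Population")], data.frame(Year = 2020:2024, Population = as.numeric(pred)))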
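A fuller sketch of the rbind idea, for illustration only. The population values here are invented placeholders standing in for the CSV, and the `future`, `combined` and `Source` names are my own additions, not anything from the question. It keeps the years next to the predictions, stacks both sets of rows, and draws them in one ggplot, so ggarrange should not be needed. Because Year is numeric, the tick marks are set with scale_x_continuous() rather than scale_x_discrete().

```r
library(ggplot2)

# Placeholder data standing in for pop_OK (values are invented, not Census figures)
pop_OK <- data.frame(Year = 2010:2019,
                     Population = 3750000 + 22000 * (0:9))

model <- lm(Population ~ Year, data = pop_OK)

# Keep the years next to the predictions instead of plotting the bare vector
future <- data.frame(Year = 2020:2024)
future$Population <- as.numeric(predict(model, newdata = future))

# Tag each row so observed and predicted points can be told apart
pop_OK$Source <- "Observed"
future$Source <- "Predicted"
combined <- rbind(pop_OK[, c("Year", "Population", "Source")], future)

# One plot for both ranges; every year from 2010 to 2024 gets a tick
p <- ggplot(combined, aes(x = Year, y = Population, colour = Source)) +
  geom_point() +
  scale_x_continuous(breaks = 2010:2024)
print(p)
```

With the real data, the placeholder data frame would simply be replaced by the pop_OK built in the question.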

Note that predict() only returns the three columns fit, lwr (lower) and upr (upper) when you ask for an interval, for example with interval = "confidence" or interval = "prediction". If you keep only the fit column, you lose the upper and lower interval bounds.
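To make the fit/lwr/upr point concrete, here is a minimal sketch, again with invented placeholder values in arbitrary units rather than the real population data. The names `future`, `pred_int` and `pred_df` are hypothetical. It requests a prediction interval and binds the result to the years so that all three columns are kept.

```r
# Placeholder data (invented values with some scatter), standing in for pop_OK
pop_OK <- data.frame(Year = 2010:2019,
                     Population = c(100, 103, 105, 108, 109,
                                    113, 114, 118, 119, 123))

model <- lm(Population ~ Year, data = pop_OK)
future <- data.frame(Year = 2020:2024)

# With interval = "prediction", predict() returns a matrix with fit, lwr, upr
pred_int <- predict(model, newdata = future, interval = "prediction")
colnames(pred_int)

# A matrix has no $ accessor, so convert it and attach the years
pred_df <- cbind(future, as.data.frame(pred_int))
pred_df
```

From there, pred_df$fit can be stacked under the observed values, and lwr and upr remain available if you want to draw the bounds, for example with geom_ribbon() or geom_errorbar().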

Hope this helps.
