Reputation: 21
I have a data set consisting of year, credit hours, and number of students. I have been trying to predict future credit hours from the number of students.
df <- data.frame(year = c(2018, 2019, 2020, 2021), student = c(1000, 1200, 1350, 1450), credit = c(4000, 4300, 4730, 4250))
mod <- lm(credit ~ year + student, data = df)
summary(mod)
I would like to predict the number of credit hours for the next few years, say 2022:2025, in a way that also factors in the predicted number of students. Is there a way to do this?
year | credit | student |
---|---|---|
2018 | 4000 | 1000 |
2019 | 4300 | 1200 |
2020 | 4730 | 1350 |
2021 | 4250 | 1450 |
2022 | NA | NA |
2023 | NA | NA |
2024 | NA | NA |
2025 | NA | NA |
In other words, how can I use a linear model in R to predict all of these NA values? I can do this with a simple linear regression without trouble, but I cannot get it to work with multiple predictors.
Upvotes: 0
Views: 307
Reputation: 643
You need to pass a data frame to predict() as the newdata argument. That data frame must supply a value of every independent variable for each prediction. If you also want to predict the number of students from a linear model and use that as input, do that step first. Something like:
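# Step 1: model student numbers as a linear function of year
lm.student <- lm(student ~ year, data = df)
pred.student <- predict(lm.student, newdata = data.frame(year = 2022:2025))
# Step 2: fit the credit-hour model on the observed data
mod <- lm(credit ~ year + student, data = df)
# Step 3: predict credit hours from the future years and the predicted student counts
MyNewData <- data.frame(year = 2022:2025, student = pred.student)
pred <- predict(mod, newdata = MyNewData)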
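If you want the predictions back in one table in place of the NA rows, a minimal sketch along these lines should work. It rebuilds the data frame from the question so that it runs on its own; the names future, full, and the source label column are my own additions for illustration, not anything from the question.

```r
# Observed data from the question
df <- data.frame(
  year    = c(2018, 2019, 2020, 2021),
  student = c(1000, 1200, 1350, 1450),
  credit  = c(4000, 4300, 4730, 4250)
)

# Fit both models on the observed years only
lm.student <- lm(student ~ year, data = df)
mod <- lm(credit ~ year + student, data = df)

# Forecast horizon taken from the question (2022:2025)
future <- data.frame(year = 2022:2025)

# Predict students first, then use those values as input for credit hours
future$student <- predict(lm.student, newdata = future)
future$credit  <- predict(mod, newdata = future)

# Label the rows (hypothetical column) and stack observed and predicted values
df$source     <- "observed"
future$source <- "predicted"
full <- rbind(df, future[, names(df)])
print(full)
```

Keep in mind that the predicted student counts are themselves an exact linear function of year, so the credit-hour forecast for 2022:2025 ends up on a straight line in year. With only four observed years, treat the numbers as a rough extrapolation.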
Upvotes: 0