ATMathew

Reputation: 12856

Using the predict() function with glm

Let's say that I have the following data set and am fitting a regression model with glm in R. I have the coefficients, but I want to predict next month's value (visits). How would I go about that in this example?

d <- data.frame(month = c("jan", "feb", "mar", "apr", "may", "june"),
                visit =  c( 1,  2,  4,  8, 16, 32),
                click =  c(64, 62, 36,  5,  6,  3),
                conv =  c(1, 3, 6, 2, 3, 8))
d
dFit <- glm(visit ~ click + conv, data=d)

For July, how can I use the predict() function in R to predict the number of visits (response variable)?

EDIT:

What I'm eventually trying to get is an output like this:

Mon   Pred_clicks
jan   20
feb   25
mar   21
apr   31
may   15
june  21 
july  50

EDIT 2:

This isn't the output I'd like:

> predict(dFit)
        1         2         3         4         5         6 
-3.452974  1.223969 13.533457 12.235771 14.113888 25.345890 

Upvotes: 4

Views: 37685

Answers (2)

musically_ut

Reputation: 34288

Since the model was fit on the columns click and conv, you have to supply a data.frame containing those columns to predict new values (month is not in the formula, so including it is optional):

 predict(dFit, data.frame(month="july", conv=mean(d$conv), click=mean(d$click)))

Here mean(d$conv) and mean(d$click) serve as placeholder estimates of those quantities for July. If you have the actual values of conv and click for July, substitute them in the statement to get your prediction.
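To get a table like the one in the question, the in-sample fitted values can be combined with the July prediction. This is only a minimal sketch: the names july, result and Pred_visits are my own, and the July inputs are just the column means, as assumed above.

```r
d <- data.frame(month = c("jan", "feb", "mar", "apr", "may", "june"),
                visit = c( 1,  2,  4,  8, 16, 32),
                click = c(64, 62, 36,  5,  6,  3),
                conv  = c( 1,  3,  6,  2,  3,  8))
dFit <- glm(visit ~ click + conv, data = d)

# Assumption: July's click and conv are unknown, so the column means stand in.
july <- data.frame(month = "july",
                   click = mean(d$click),
                   conv  = mean(d$conv))

# Fitted values for the observed months, followed by the July prediction.
result <- data.frame(
  Mon         = c(as.character(d$month), july$month),
  Pred_visits = unname(c(predict(dFit), predict(dFit, newdata = july)))
)
result
```

Note that with the default gaussian family and an intercept, predicting at the means of the predictors simply returns the mean of visit (10.5 here), so this July value carries no new information unless real or forecast inputs are supplied.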

However, that is probably not what you are looking for, and a GLM regression may not be the best model for this sort of time series data. I think you would want to use a vector autoregression (VAR) as your predictive model.

Upvotes: 4

David

Reputation: 9405

Assuming you have a data frame containing July's data named newdata, you would just do:

predict(dFit,newdata)

If you don't have click and conv values for July (or at least estimates of them), the model cannot produce a prediction for July.
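As a rough illustration of what newdata could look like, here is a sketch with made-up July inputs (click = 4 and conv = 5 are hypothetical values, not real data). The type = "response" argument asks for predictions on the scale of visit; for the default gaussian family it matches the default output, but it matters for families such as poisson or binomial.

```r
d <- data.frame(month = c("jan", "feb", "mar", "apr", "may", "june"),
                visit = c( 1,  2,  4,  8, 16, 32),
                click = c(64, 62, 36,  5,  6,  3),
                conv  = c( 1,  3,  6,  2,  3,  8))
dFit <- glm(visit ~ click + conv, data = d)

# Hypothetical July inputs; replace with the real (or forecast) values.
newdata <- data.frame(click = 4, conv = 5)

# Only the columns used in the formula (click, conv) are required.
julyPred <- predict(dFit, newdata = newdata, type = "response")
julyPred
```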

Upvotes: 3
