Reputation: 159
I'm trying to get the predicted number of car accidents by age and sex, and ultimately adjust the predictions for population.
My data is (df):
df <- dplyr::tibble(
  city = c("a", "a", "b", "b", "c", "c"),
  sex = c(1, 0, 1, 0, 1, 0),
  age = c(1, 2, 1, 2, 1, 2),
  population = c(100, 123, 189, 234, 221, 435),
  accidents = c(87, 98, 79, 43, 45, 65)
)
My code:
library(tidyverse)
library(ggeffects)
poisson <- glm(accidents ~ sex + age, family = "poisson", data = df)
df <- df %>%
  mutate(acc_pred = predict(poisson))
Output:
city sex age population accidents acc_pred
a 1 1 100 87 4.36
a 0 2 123 98 4.43
b 1 1 189 79 4.21
b 0 2 234 43 4.25
c 1 1 221 45 4.26
c 0 2 435 65 3.93
What am I doing wrong?
Upvotes: 0
Views: 483
Reputation: 174506
A Poisson glm uses a log link function, and by default the predict.glm method returns predictions on the scale of the linear predictor (type = "link"), that is, without applying the inverse link function. You either need to pass type = "response" to predict, which applies the inverse link function and gives you predictions in the same units as your input data, or equivalently, since the inverse of the log link is just exp, you can exponentiate the results of predict.
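As a quick sanity check that the two scales really differ only by the inverse link, here is a minimal base-R sketch. It refits the model from the question under a new name, fit (my own choice, so the object does not share a name with the poisson family function), and uses family(fit)$linkinv, the inverse link stored on the fitted model's family object.

```r
# Sample data from the question (base data.frame, so no packages are needed)
df <- data.frame(
  sex = c(1, 0, 1, 0, 1, 0),
  age = c(1, 2, 1, 2, 1, 2),
  population = c(100, 123, 189, 234, 221, 435),
  accidents = c(87, 98, 79, 43, 45, 65)
)

# "fit" is an illustrative name, not the one used in the question
fit <- glm(accidents ~ sex + age, family = poisson, data = df)

link_scale <- predict(fit)                         # default: type = "link" (log scale)
response_scale <- predict(fit, type = "response")  # expected counts

# The inverse link is stored on the family object; for the log link it is exp
inv_link <- family(fit)$linkinv

# Both comparisons should report TRUE
same_via_exp <- all.equal(unname(exp(link_scale)), unname(response_scale))
same_via_linkinv <- all.equal(unname(inv_link(link_scale)), unname(response_scale))
print(same_via_exp)
print(same_via_linkinv)
```

Using the stored inverse link rather than a hard-coded exp has the advantage that the same lines keep working if the model is later refitted with a different link.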
So you can do either:
df %>%
  mutate(acc_pred = predict(poisson, type = 'response'))
#> city sex age population accidents acc_pred
#> 1 a 1 1 100 87 70.33333
#> 2 a 0 2 123 98 68.66667
#> 3 b 1 1 189 79 70.33333
#> 4 b 0 2 234 43 68.66667
#> 5 c 1 1 221 45 70.33333
#> 6 c 0 2 435 65 68.66667
Or
df %>%
  mutate(acc_pred = exp(predict(poisson)))
#> city sex age population accidents acc_pred
#> 1 a 1 1 100 87 70.33333
#> 2 a 0 2 123 98 68.66667
#> 3 b 1 1 189 79 70.33333
#> 4 b 0 2 234 43 68.66667
#> 5 c 1 1 221 45 70.33333
#> 6 c 0 2 435 65 68.66667
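The question also mentions adjusting for population, which the predictions above do not do. One common way to handle this in a Poisson model is to include log(population) as an offset, so that the model describes accidents per person rather than raw counts. The following is only a sketch under that assumption: the names fit_rate and rate_per_100 and the per-100 scaling are illustrative choices, not something taken from the question.

```r
# Sample data from the question (base data.frame, so no packages are needed)
df <- data.frame(
  city = c("a", "a", "b", "b", "c", "c"),
  sex = c(1, 0, 1, 0, 1, 0),
  age = c(1, 2, 1, 2, 1, 2),
  population = c(100, 123, 189, 234, 221, 435),
  accidents = c(87, 98, 79, 43, 45, 65)
)

# offset(log(population)) fixes the coefficient of log(population) at 1,
# so the remaining terms model the accident rate per person.
# Note: in this toy data age is fully determined by sex, so glm() reports
# NA for the age coefficient; the fitted values are still well defined.
fit_rate <- glm(accidents ~ sex + age + offset(log(population)),
                family = poisson, data = df)

# Expected counts given each row's own population (the offset is included)
df$acc_pred <- predict(fit_rate, type = "response")

# Expected accidents per 100 people (assumed scaling, for readability only)
df$rate_per_100 <- 100 * df$acc_pred / df$population

print(df)
```

Whether an offset is appropriate depends on whether accidents can reasonably be assumed proportional to population; if not, population could instead enter the model as an ordinary covariate.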
Upvotes: 5