Reputation: 59
I was hoping to get some advice on a GLM with Poisson family.
I have a data set with the number of bites taken by each individual over a period of time. As the individuals observed were feeding for different periods of time, when I calculated the bite rate for each individual as bites/minute, I have ended up with non-integer numbers. Now, based on what I have read so far, I should still be able to do a GLM with a Poisson family. However, I am encountering errors and I think this may be because R is not liking the fact I'm using non-integers. Does anyone have any advice?
Example <- structure(list(Species = c("Fish1", "Fish2", "Fish3", "Fish4",
"Fish5", "Fish6", "Fish7", "Fish1", "Fish2", "Fish3", "Fish4",
"Fish5", "Fish6", "Fish7", "Fish1", "Fish2", "Fish3", "Fish4",
"Fish5", "Fish6", "Fish7"), Site = c(1, 1, 1, 1, 1, 1, 1, 2,
2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3), Bite_Rate = c(3.5, 7.5,
0, 0, 2.45, 5.5, 6.5, 6.5, 7.5, 8.03, 32.1, 15.6, 18.2, 19.1,
20.5, 20.5, 3.5, 5.7, 6.7, 23.2, 0)), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -21L), spec = structure(list(
cols = list(Species = structure(list(), class = c("collector_character",
"collector")), Site = structure(list(), class = c("collector_double",
"collector")), Bite_Rate = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"))
str(Example) # check structure
Example$Species<-as.factor(Example$Species) # set species as a factor
str(Example) # check structure
glm<-glm(Species~Bite_Rate, data=Example, family = poisson) # create the GLM
The error message when I run the GLM is:
Error in if (any(y < 0)) stop("negative values not allowed for the 'Poisson' family") :
missing value where TRUE/FALSE needed
In addition: Warning message:
In Ops.factor(y, 0) : ‘<’ not meaningful for factors
I don't actually have any negative values which is what throws me off a little bit. Any advice would be much appreciated!
EDIT: Based on the comments, I have updated my example data so that it has counts of bites and the time of observation in seconds
Example <- structure(list(Species = structure(c(1L, 2L, 3L, 4L, 5L, 6L,
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("Fish1",
"Fish2", "Fish3", "Fish4", "Fish5", "Fish6", "Fish7"), class = "factor"),
Site = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3,
3, 3, 3, 3, 3), Bites = c(0, 10, 18, 17, 6, 0, 1, 0, 19,
12, 7, 3, 5, 1, 5, 0, 10, 18, 17, 7, 25), Observed_Seconds = c(50,
33, 47, 20, 17, 10, 14, 21, 48, 10, 50, 33, 47, 20, 17, 10,
14, 21, 48, 10, 90)), row.names = c(NA, -21L), spec = structure(list(
cols = list(Species = structure(list(), class = c("collector_character",
"collector")), Site = structure(list(), class = c("collector_double",
"collector")), Bites = structure(list(), class = c("collector_double",
"collector")), Observed_Seconds = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"))
Thank you!
Upvotes: 1
Views: 2424
Reputation: 46888
You need to get the seconds (denominator) with which you normalized, and the actual bites (counts).
Next include the minutes as an offset, and note, your response variable is on the left hand side of ~ :
fit = glm(Bites ~ Species,offset=log(Observed_Seconds),
family=poisson,data=Example)
You can look at the summary:
summary(fit)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
SpeciesFish1 -2.8679 0.4472 -6.413 1.42e-10 ***
SpeciesFish2 -1.1436 0.1857 -6.158 7.35e-10 ***
SpeciesFish3 -0.5738 0.1581 -3.629 0.000284 ***
SpeciesFish4 -0.7732 0.1543 -5.011 5.42e-07 ***
SpeciesFish5 -1.3269 0.1961 -6.766 1.33e-11 ***
SpeciesFish6 -1.7198 0.2887 -5.958 2.56e-09 ***
SpeciesFish7 -1.5244 0.1925 -7.921 2.35e-15 ***
Seems very significant, but ne good thing to check is whether the data is overdispersed, and also include other factors (e.g Site):
fit_quasi = glm(Bites ~ Species + factor(Site),offset=log(Observed_Seconds),
family=quasipoisson,data=Example)
summary(fit_quasi)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.9754 1.0713 -2.777 0.0167 *
SpeciesFish2 1.8487 1.1434 1.617 0.1319
SpeciesFish3 2.2731 1.1152 2.038 0.0642 .
SpeciesFish4 2.1246 1.1205 1.896 0.0823 .
SpeciesFish5 1.3533 1.1604 1.166 0.2662
SpeciesFish6 1.2754 1.2658 1.008 0.3336
SpeciesFish7 0.8922 1.1719 0.761 0.4612
factor(Site)2 -0.2325 0.5132 -0.453 0.6587
factor(Site)3 0.6118 0.4677 1.308 0.2154
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for quasipoisson family taken to be 5.521045)
If it follows a poisson, the dispersion will be around 1, but in this case, it is overdispersed.
Upvotes: 2