user8435999

How to use fitdist function (negative binomial)?

I have the following data set

aa <- data.frame("set_up" = c(1,1,1,1,1,1,2,2,2,3,3,3), set = c(1,1,1,2,2,2,1,1,1,3,3,3), mass = c(45,12,34,7,1,433,56,12,54,6,7,8))

I want to find the parameter k of the negative binomial distribution, grouped by set and set_up.

The call fitdist(data = aa$mass, distr = "nbinom", method = "mle")$estimate[[1]] gives the value of the k parameter for the pooled data. I want to estimate k for each group of set_up and set.

Here is the dplyr code for it

library(dplyr)
library(fitdistrplus)
aak <- aa %>% 
  group_by(set_up, set)%>% 
  summarise(ktotalinf = fitdist(data = aa$mass, distr = "nbinom", method = "mle")$estimate[[1]])%>%
  as.data.frame()

I get an output, but it is the same value repeated for each row. This value of the estimate[[1]] is the same as if all the mass data were pooled (and not grouped). Any suggestions on how to resolve this?

Upvotes: 0

Views: 3390

Answers (1)

IRTFM

Reputation: 263342

You got the answer, but not the reasoning behind it. The magrittr/dplyr mechanism is to create a local environment for the application of each successive function along the chain of %>% passages.
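A small sketch of that per-group evaluation, assuming the aa data frame from the question and dplyr 1.0 or later (for the .groups argument). The output column names n_values and values are made up for illustration; the point is only that a bare column name inside summarise refers to one group's slice at a time.

```r
library(dplyr)

# Data frame copied from the question
aa <- data.frame(
  set_up = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
  set    = c(1, 1, 1, 2, 2, 2, 1, 1, 1, 3, 3, 3),
  mass   = c(45, 12, 34, 7, 1, 433, 56, 12, 54, 6, 7, 8)
)

# Inside summarise(), the bare name `mass` is looked up in the current
# group first, so each row of the result is built from that group only.
per_group <- aa %>%
  group_by(set_up, set) %>%
  summarise(
    n_values = length(mass),                  # size of the group's slice
    values   = paste(mass, collapse = ", "),  # the slice itself, as text
    .groups  = "drop"
  ) %>%
  as.data.frame()

print(per_group)
```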

When you gave the fitdistrplus::fitdist function the data argument of aa$mass, you actually went outside of the local environment where the values had been separately grouped by your set_up and set variables. There is no aa-named entity inside the local environment, so R resolves aa in the calling (global) environment, where it is still the full, ungrouped data frame. There is an entity named . (a period), which gets passed along from function to function, getting altered in some manner at each step. Instead of applying the function to each group, fitdist always got the same argument, which was the entire mass column. When you change the data argument to mass, the R interpreter first looks inside the local environment and does find an entity of that name within each group, holding only that group's values.
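To make the difference concrete, here is a hedged sketch that fits both ways side by side, assuming the question's data and dplyr 1.0 or later. The helper fit_k and the column names k_pooled and k_grouped are hypothetical names introduced here; the tryCatch wrapper is an added assumption so that a group whose fit fails yields NA instead of stopping the whole pipeline.

```r
library(dplyr)
library(fitdistrplus)

# Data frame copied from the question
aa <- data.frame(
  set_up = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
  set    = c(1, 1, 1, 2, 2, 2, 1, 1, 1, 3, 3, 3),
  mass   = c(45, 12, 34, 7, 1, 433, 56, 12, 54, 6, 7, 8)
)

# Hypothetical helper: returns the first estimate (the k parameter, as in
# the question), or NA if fitdist() raises an error for this input.
fit_k <- function(x) {
  tryCatch(
    fitdist(data = x, distr = "nbinom", method = "mle")$estimate[[1]],
    error = function(e) NA_real_
  )
}

aak <- aa %>%
  group_by(set_up, set) %>%
  summarise(
    k_pooled  = fit_k(aa$mass),  # global lookup: all 12 values every time
    k_grouped = fit_k(mass),     # group lookup: only this group's values
    .groups   = "drop"
  ) %>%
  as.data.frame()

print(aak)
```

The k_pooled column repeats one value because every group is handed the same full vector, while k_grouped is fitted separately per group. With only three observations per group these estimates will be very unstable, and a group such as 6, 7, 8, whose variance is below its mean, may fail to converge or return a very large k.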

Upvotes: 1
