Bigchao

Reputation: 1756

About negative binomial regression in large data

I have a dataset with 19 columns and more than 10 million rows, and I want to fit a negative binomial regression to it.

Since memory is the bottleneck, I planned to use the ff package to work around it. It turned out, however, that the glm.nb function in the MASS package cannot be used in this case. There is also the ffbase package, which adds some enhanced functions, but glm.nb is not among them.

The bigmemory and biganalytics packages have the same problem.

I don't know whether my understanding is correct, or whether there is in fact a feasible way to combine ff and MASS. How should I proceed?

PS: I use Windows, which seems to be a curse when dealing with such large data.

Any links, comments, or tips are appreciated!

Upvotes: 1

Views: 557

Answers (1)

Spacedman

Reputation: 94277

Take a random sample of your data points and run the analysis on it. Repeat this several times, then estimate the variance due to this Monte Carlo process. If your resulting parameters are still significantly non-zero, then stop.
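A minimal sketch of that procedure in R might look like the following. It is only an illustration under stated assumptions: the data frame `big`, its columns `x1`, `x2`, `y`, the simulated coefficients, and the settings `n_rep` and `n_samp` are all made up here, and in practice each subsample would be pulled from wherever the full data lives rather than from an in-memory data frame.

```r
library(MASS)  # provides glm.nb() and rnegbin()

set.seed(42)

# Hypothetical stand-in for the real data (assumption: three columns only).
n_total <- 200000
big <- data.frame(x1 = rnorm(n_total), x2 = rnorm(n_total))
big$y <- rnegbin(n_total,
                 mu = exp(0.5 + 0.3 * big$x1 - 0.2 * big$x2),
                 theta = 2)

n_rep  <- 10     # number of random subsamples (assumption)
n_samp <- 5000   # rows per subsample, small enough to fit in memory (assumption)

# Fit the model on one random subsample and return its coefficients.
fit_once <- function() {
  idx <- sample.int(n_total, n_samp)
  coef(glm.nb(y ~ x1 + x2, data = big[idx, ]))
}

# One row of coefficient estimates per subsample.
est <- t(replicate(n_rep, fit_once()))

# Spread of the estimates across subsamples: the Monte Carlo variability.
mc_mean <- colMeans(est)
mc_sd   <- apply(est, 2, sd)

print(round(cbind(mean = mc_mean, sd = mc_sd, ratio = mc_mean / mc_sd), 3))
```

If a coefficient's mean is many subsample standard deviations away from zero, the subsample size is already enough to detect it; if not, `n_samp` or `n_rep` could be increased.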

Upvotes: 4
