Oscar Mederos

Reputation: 29833

What are some good practices for unit testing probability distributions?

I'm working on a project where I need to generate Poisson, Normal, etc. random variables from scratch. I know there are existing implementations in Python. I'm used to writing tests for almost everything I code.

I'm wondering what would be a good practice (if any) to test those functions?

Upvotes: 4

Views: 2096

Answers (3)

Lior Kogan

Reputation: 20618

I assume that your implementation is built on top of a uniform-distribution pseudorandom number generator that you trust to be good enough (not only in the distribution of the generated values, but also in the randomness of their order; see the Diehard tests).

You should build two histograms. The first is based on values generated by your implementation. The second is based on a trusted implementation or, better, on the expected count in each histogram column, computed directly from the probabilities of the given distribution.

Next, you can verify that the counts match in every histogram column, each to within a tight confidence interval around its expected count.
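
As a rough illustration of the histogram comparison above, here is a minimal sketch for a Poisson generator. The names `poisson_sample`, `poisson_pmf` and `histogram_matches` are hypothetical, the sampler is just a stand-in for your own code, and the interval uses a normal approximation to the binomial count in each column. The seed is fixed and the interval is wide (`z=5`) so that a correct generator should essentially never fail the test by chance.

```python
import math
import random
from collections import Counter

def poisson_sample(lam, rng):
    """Hypothetical sampler under test (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def poisson_pmf(k, lam):
    """Exact Poisson probability P(X = k)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def histogram_matches(samples, lam, max_bin, z=5.0):
    """Compare each histogram column with its expected count n * p.

    Columns 0..max_bin-1 hold exact values; the last column collects
    every value >= max_bin so that the probabilities sum to 1.
    """
    n = len(samples)
    counts = Counter(min(x, max_bin) for x in samples)
    probs = [poisson_pmf(k, lam) for k in range(max_bin)]
    probs.append(1.0 - sum(probs))  # tail column
    for k, p in enumerate(probs):
        expected = n * p
        # Each column count is Binomial(n, p); allow z standard deviations.
        half_width = z * math.sqrt(n * p * (1.0 - p))
        if abs(counts[k] - expected) > half_width:
            return False
    return True

rng = random.Random(12345)  # fixed seed keeps the test repeatable
samples = [poisson_sample(4.0, rng) for _ in range(20000)]
print(histogram_matches(samples, lam=4.0, max_bin=12))
```

Checking the same samples against the wrong parameter (say `lam=5.0`) should return `False`, which is a cheap way to confirm the test has some power.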

Upvotes: 4

hvgotcodes

Reputation: 120198

You could at the very least assert that the returned value is not null and in the range you expect. That still ensures that the methods at least run and don't error out and that they pass a basic sanity check.
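
A basic sanity check of this kind might look like the sketch below. `poisson_sample` is again a hypothetical stand-in for the function under test, and `sanity_problems` is a made-up helper name. For a Poisson variable, the expected range is the non-negative integers.

```python
import math
import random

def poisson_sample(lam, rng):
    """Hypothetical sampler under test (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sanity_problems(draw, n=1000):
    """Call draw() n times and return a list of problems (empty if all is well)."""
    problems = []
    for i in range(n):
        x = draw()
        if x is None:
            problems.append(f"draw {i}: got None")
        elif not isinstance(x, int):
            problems.append(f"draw {i}: expected int, got {type(x).__name__}")
        elif x < 0:
            problems.append(f"draw {i}: negative value {x}")
    return problems

rng = random.Random(1)  # fixed seed keeps the test repeatable
problems = sanity_problems(lambda: poisson_sample(3.0, rng))
print(problems)
```

This only shows that the function runs and returns plausible values; it says nothing about whether the distribution is right.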

You could also gather many values and assert that they come reasonably close to the expected distribution, but that would take more work.

Upvotes: 1

DRVic

Reputation: 2481

What I've done in similar circumstances is a) write a simple histogram routine that plots a histogram of samples, and run it on a few thousand samples to eyeball it; and b) test some key statistics - standard deviation, mean, ... to see that they behave as they should.
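
For part b), a minimal sketch of a mean and standard deviation check for a Normal generator could look like this. `normal_sample` (a Box-Muller transform) is a hypothetical stand-in for your own sampler, and `moments_match` is a made-up helper name. The tolerances are `z` standard errors wide, and the standard error used for the sample standard deviation, `sigma / sqrt(2n)`, is an approximation that assumes normally distributed data.

```python
import math
import random
import statistics

def normal_sample(mu, sigma, rng):
    """Hypothetical sampler under test (Box-Muller transform)."""
    u1 = 1.0 - rng.random()  # shift to (0, 1] so log(u1) is defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    return mu + sigma * radius * math.cos(2.0 * math.pi * u2)

def moments_match(samples, mu, sigma, z=5.0):
    """Check the sample mean and standard deviation against mu and sigma."""
    n = len(samples)
    # Standard error of the mean is sigma / sqrt(n).
    mean_ok = abs(statistics.fmean(samples) - mu) <= z * sigma / math.sqrt(n)
    # Approximate standard error of the sample std for normal data.
    std_ok = abs(statistics.stdev(samples) - sigma) <= z * sigma / math.sqrt(2 * n)
    return mean_ok and std_ok

rng = random.Random(2024)  # fixed seed keeps the test repeatable
samples = [normal_sample(10.0, 2.0, rng) for _ in range(50000)]
print(moments_match(samples, mu=10.0, sigma=2.0))
```

Matching moments are necessary but not sufficient: a wrong distribution can share the same mean and standard deviation, so this works best alongside a histogram check.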

Upvotes: 1
