Porsche9II

Reputation: 651

Fit 5000 samples to a normal distribution using maximum likelihood in C++

I have a sample of 5000 doubles like

sample = {1.23, -4.67, 0.17, 1.25, 6.89, -2.03, ...}

and want to fit the data to a parametric distribution such as N(mu, sigma) or a generalized Student t(loc, scale, DoF).

I already have the PDFs of these distributions, PDF_normal(mu, sigma)(x) and PDF_t(loc, scale, DoF)(x), and can calculate the sum of the logarithms of the PDFs over the 5000 samples for fixed distribution parameters.
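
For concreteness, this is roughly what my computation looks like in the normal case (a minimal sketch; the function name and the std::vector interface are just what I use here):

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Sum of log-PDFs of N(mu, sigma) over the sample, i.e. the
    // log-likelihood at fixed parameters (mu, sigma).
    double logLikNormal(const std::vector<double>& sample, double mu, double sigma)
    {
        const double log2pi = std::log(2.0 * 3.14159265358979323846);
        double ll = 0.0;
        for (std::size_t i = 0; i < sample.size(); ++i)
        {
            const double z = (sample[i] - mu) / sigma;
            ll += -0.5 * log2pi - std::log(sigma) - 0.5 * z * z;
        }
        return ll;
    }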

Now, I want to use a C++ algorithm for nonlinear optimization to find the parameters (mu_max, sigma_max) or (loc_max, scale_max, DoF_max) that maximize the log-likelihood.

The R Project for Statistical Computing solves this problem in the MASS package in the following way: "... direct optimization of the log-likelihood is performed using optim. The estimated standard errors are taken from the observed information matrix, calculated by a numerical approximation. For one-dimensional problems the Nelder-Mead method is used and for multi-dimensional problems the BFGS method ..."

Unfortunately, I cannot use the R solution but have to come up with a solution in Microsoft VS2010 C++. I don't want to write the optimization code myself, and I don't want to look at the R source code and rewrite it for C++.

Any suggestions where I can find a fast and well-tested implementation of BFGS (or something similar) for C++?

Is there anything available in Boost, Intel MKL, etc.?

Thanks for your help, Matt

Upvotes: 1

Views: 1430

Answers (2)

tmyklebu

Reputation: 14205

The distributions you're mentioning all have very few parameters. It might be more useful to do straight Newton's method. You can work out the log-likelihood function and its gradient and Hessian (with respect to the parameters, not the data).
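
For the normal model those derivatives are short enough to write out by hand. Here is a sketch of the resulting Newton iteration on (mu, sigma), in plain C++03 so it builds in VS2010 (the function name, iteration cap, and tolerance are mine; the 2x2 system is solved with the explicit inverse here, and there is no step safeguard, so start near the optimum, e.g. at the sample moments, or add damping):

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Newton's method on the N(mu, sigma) log-likelihood, with the
    // analytic gradient and Hessian in the parameters. No line search
    // and no safeguard keeping sigma > 0.
    void fitNormalNewton(const std::vector<double>& x, double& mu, double& sigma)
    {
        const double n = static_cast<double>(x.size());
        for (int iter = 0; iter < 50; ++iter)
        {
            double m1 = 0.0, m2 = 0.0;   // sum(x - mu), sum((x - mu)^2)
            for (std::size_t i = 0; i < x.size(); ++i)
            {
                const double d = x[i] - mu;
                m1 += d;
                m2 += d * d;
            }
            const double s2 = sigma * sigma;
            // gradient of the log-likelihood
            const double g0 = m1 / s2;
            const double g1 = -n / sigma + m2 / (s2 * sigma);
            // Hessian of the log-likelihood
            const double h00 = -n / s2;
            const double h01 = -2.0 * m1 / (s2 * sigma);
            const double h11 = n / s2 - 3.0 * m2 / (s2 * s2);
            // Newton step: solve H * step = -g (2x2, explicit inverse)
            const double det = h00 * h11 - h01 * h01;
            const double d0 = (-g0 * h11 + g1 * h01) / det;
            const double d1 = (-g1 * h00 + g0 * h01) / det;
            mu += d0;
            sigma += d1;
            if (std::fabs(d0) + std::fabs(d1) < 1e-12) break;
        }
    }

For the Student t you write out the analogous derivatives in (loc, scale, DoF) and get a 3x3 system; the shape of the iteration is identical.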

When you're solving the system at the end, you should use a Cholesky factorisation or an LDL^T factorisation instead of Gaussian elimination. You will know all of the factors with precision roughly sqrt(machine epsilon), which is 10^-7-ish on most systems.
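
If you don't have a small-matrix solver handy, a dense Cholesky solve is only a few lines. A sketch (row-major storage and naming are mine; pass minus the Hessian, which should be positive definite near the maximum):

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Cholesky factorisation A = L * L^T of a small symmetric
    // positive-definite matrix, then forward/back substitution to
    // solve A * x = b. a is dense row-major with the lower triangle
    // filled in; a and b are taken by value so they can be overwritten.
    bool choleskySolve(std::vector<double> a, std::vector<double> b,
                       std::vector<double>& x, std::size_t n)
    {
        // factor: overwrite the lower triangle of a with L
        for (std::size_t j = 0; j < n; ++j)
        {
            double d = a[j * n + j];
            for (std::size_t k = 0; k < j; ++k)
                d -= a[j * n + k] * a[j * n + k];
            if (d <= 0.0) return false;      // not positive definite
            a[j * n + j] = std::sqrt(d);
            for (std::size_t i = j + 1; i < n; ++i)
            {
                double s = a[i * n + j];
                for (std::size_t k = 0; k < j; ++k)
                    s -= a[i * n + k] * a[j * n + k];
                a[i * n + j] = s / a[j * n + j];
            }
        }
        // forward substitution: L * y = b (y overwrites b)
        for (std::size_t i = 0; i < n; ++i)
        {
            for (std::size_t k = 0; k < i; ++k)
                b[i] -= a[i * n + k] * b[k];
            b[i] /= a[i * n + i];
        }
        // back substitution: L^T * x = y
        x = b;
        for (std::size_t i = n; i-- > 0; )
        {
            for (std::size_t k = i + 1; k < n; ++k)
                x[i] -= a[k * n + i] * x[k];
            x[i] /= a[i * n + i];
        }
        return true;
    }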

A more robust way to get the Hessian's Cholesky factorisation, if it's the sum of a bunch of rank-one matrices, is to incorporate each new rank-one matrix into the Cholesky factorisation using a sequence of Givens rotations. This gets you precision close to machine epsilon, but it's roughly twice as slow as forming the Hessian and then taking its Cholesky factorisation in the usual way.
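
For reference, a sketch of that rank-one update (the standard "cholupdate" loop; each column step is a Givens-style rotation, and this covers the positive-update direction only, since downdates need the hyperbolic variant):

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Rank-one Cholesky update: given lower-triangular L with
    // A = L * L^T, rewrite L in place so that L * L^T = A + v * v^T.
    // L is dense row-major; v is taken by value and consumed.
    void cholUpdate(std::vector<double>& L, std::vector<double> v, std::size_t n)
    {
        for (std::size_t k = 0; k < n; ++k)
        {
            const double r = std::sqrt(L[k * n + k] * L[k * n + k] + v[k] * v[k]);
            const double c = r / L[k * n + k];
            const double s = v[k] / L[k * n + k];
            L[k * n + k] = r;
            for (std::size_t i = k + 1; i < n; ++i)
            {
                L[i * n + k] = (L[i * n + k] + s * v[i]) / c;
                v[i] = c * v[i] - s * L[i * n + k];
            }
        }
    }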

The technology here is simple enough that you can probably code it yourself.

It's difficult for me to suggest a general-purpose software package for you. This is partly because I haven't had any positive experiences with other people's C++ optimization libraries. Mostly, though, writing up Newton's method isn't much work once you've got code to evaluate the gradient and Hessian, so I haven't really felt the need.

That said, you might want to look and see what the COIN-OR project has to offer. You might be able to save some time using automatic differentiation tools (COIN-OR has one). I should mention, though, that I've never used COIN-OR's nonlinear optimisation offerings or an automatic differentiation tool for anything myself.

Upvotes: 0

Porsche9II

Reputation: 651

Okay, I do not need any optimization for the MLE of the normal distribution, because it can be solved in closed form; see http://de.wikipedia.org/wiki/Maximum-Likelihood-Methode
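
For completeness, in code the closed-form estimates are just the sample mean and the 1/n (not 1/(n-1)) standard deviation:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Closed-form MLE for N(mu, sigma): sample mean and the
    // 1/n standard deviation.
    void fitNormalClosedForm(const std::vector<double>& x, double& mu, double& sigma)
    {
        const double n = static_cast<double>(x.size());
        double sum = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i)
            sum += x[i];
        mu = sum / n;
        double ss = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i)
        {
            const double d = x[i] - mu;
            ss += d * d;
        }
        sigma = std::sqrt(ss / n);   // ML estimate divides by n
    }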

But I want to solve this problem for different families of distributions where I only know the PDFs. Thus I still need a nice C++ implementation of a nonlinear solver...

Upvotes: 1
