beldaz
beldaz

Reputation: 4471

Cholesky decomposition generating NaNs in Java

I'm not sure whether this is a maths.se or a SO question, but I'm going with SO as I think it's related to my Java.

I'm following a text book on Gaussian Processes (R&W) and implementing some examples in Java. One common step for several examples is to generate a Cholesky decomposition of a covariance matrix. In my attempt I can get successful results for matrices up to a limited size (33x33). However, for anything larger a NaN appears in the diagonal (at 32,32) and so all subsequent values in the matrix are likewise NaNs.

The code is shown below, and the source of the NaN is indicated in the cholesky method. Essentially the covariance element a[32][32] is 1.0, but the value of sum is a little over this (1.0000001423291431), so the square root is imaginary. So my questions are:

  1. Is this an expected result from linear algebra, or, e.g., an artefact of my implementation?
  2. How is this problem best avoided in practice?

Note that I'm not looking for recommendations of libraries to use. This is simply for my own understanding.

Apologies for the length, but I've tried to provide a complete MWE:

import static org.junit.Assert.assertFalse;

import org.junit.Test;

public class CholeskyTest {

    @Test
    public void testCovCholesky() {
        final int n = 34; // Test passes for n<34
        final double[] xData = getSpread(-5, 5, n);
        double[][] cov = covarianceSE(xData);
        double[][] lower = cholesky(cov);
        for(int i=0; i<n; ++i) {
            for(int j=0; j<n; ++j) {
                assertFalse("NaN at " + i + "," + j, Double.isNaN(lower[i][j]));
            }
        }
    }

    /**
     * Generate n evenly space values from min to max inclusive
     */
    private static double[] getSpread(final double min, final double max, final int n) {
        final double[] values = new double[n];
        final double delta = (max - min)/(n - 1);
        for(int i=0; i<n; ++i) {
            values[i] = min + i*delta;
        }
        return values;
    }

    /**
     * Calculate the covariance matrix for the given observations using
     * the squared exponential (SE) covariance function.
     */
    private static double[][] covarianceSE (double[] v) {
        final int m = v.length;
        double[][] k = new double[m][];
        for(int i=0; i<m; ++i) {
            double vi = v[i];
            double row[] = new double[m];
            for(int j=0; j<m; ++j) {
                double dist = vi - v[j];
                row[j] = Math.exp(-0.5*dist*dist);
            }
            k[i] = row;
        }
        return k;
    }

    /**
     * Calculate lower triangular matrix L such that LL^T = A
     * Using Cholesky decomposition from
     * https://rosettacode.org/wiki/Cholesky_decomposition#Java
     */
    private static double[][] cholesky(double[][] a) {
        final int m = a.length;
        double[][] l = new double[m][m];
        for(int i = 0; i< m;i++){
            for(int k = 0; k < (i+1); k++){
                double sum = 0;
                for(int j = 0; j < k; j++){
                    sum += l[i][j] * l[k][j];
                }
                l[i][k] = (i == k) ? Math.sqrt(a[i][i] - sum) : // Source of NaN at 32,32
                    (1.0 / l[k][k] * (a[i][k] - sum));
            }
        }
        return l;
    }
}

Upvotes: 1

Views: 1210

Answers (2)

DavidB2013
DavidB2013

Reputation: 67

I just finished writing my own version of a Cholesky Decomposition routine in C++ and JavaScript. Instead of computing L, it computes U, but I would be curious to test it with the matrix that causes the NaN error. Would you be able to post the matrix here, or contact me (info in Profile.)

Upvotes: 0

beldaz
beldaz

Reputation: 4471

Hmm, I think I've found an answer to my own question, from the same textbook I was following. From R&W p.201:

In practice it may be necessary to add a small multiple of the identity matrix $\epsilon I$ to the covariance matrix for numerical reasons. This is because the eigenvalues of the matrix K can decay very rapidly [...] and without this stabilization the Cholesky decomposition fails. The effect on the generated samples is to add additional independent noise of variance $epsilon$.

So the following change seems to be sufficient:

private static double[][] cholesky(double[][] a) {
    final int m = a.length;
    double epsilon = 0.000001; // Small extra noise value
    double[][] l = new double[m][m];
    for(int i = 0; i< m;i++){
        for(int k = 0; k < (i+1); k++){
            double sum = 0;
            for(int j = 0; j < k; j++){
                sum += l[i][j] * l[k][j];
            }
            l[i][k] = (i == k) ? Math.sqrt(a[i][i]+epsilon - sum) : // Add noise to diagonal values
                (1.0 / l[k][k] * (a[i][k] - sum));
        }
    }
    return l;
}

Upvotes: 2

Related Questions