usernumber

Reputation: 2186

Gaussian KDE of n-dimensional data: leading minor of the array is not positive definite

I have two subsets of n-dimensional data, A and B. For each sample in B, I would like to know the density of samples from A around it.

Example datasets, each with 5 samples in 3 dimensions:

import numpy as np

A = np.array([[-2.44528668, -0.09326276, -1.06527892],
       [-1.35144799, -1.45507518, -0.02096   ],
       [-0.5788315 , -1.48932706, -0.28496559],
       [-1.60224949, -0.76823424, -0.11548589],
       [-1.15768561, -0.74704022, -0.14744463]])

B = np.array([[-1.84134663, -1.42036525, -1.38819347],
       [-2.58165693, -2.49423057, -1.57609454],
       [-0.78776371, -0.79168188,  0.21967791],
       [-1.0165618 , -1.78509185, -0.68373997],
       [-1.21764947, -0.43215885, -0.34393573]])

I tried the following:

from scipy.stats import gaussian_kde

kernel = gaussian_kde(A)
densities = kernel(B)

but this raised the following error:

LinAlgError: 2-th leading minor of the array is not positive definite

What does this error mean, and how can I get the density of points from A for each sample in B?

Upvotes: 3

Views: 2632

Answers (1)

usernumber

Reputation: 2186

Based on the error message I get when I do

kernel = gaussian_kde(A)
densities = kernel(B[0])

I figured that gaussian_kde treats each column as one sample and each row as the coordinates along one dimension, so I should be passing the transpose of my arrays instead.

So, to get the result I want, I should be doing:

kernel = gaussian_kde(A.T)
densities = kernel(B.T)
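As a quick check of that orientation, here is a minimal sketch using the arrays from the question. It builds the estimator from the transposed array and inspects its `d` (number of dimensions) and `n` (number of samples) attributes; the variable names are only illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Data from the question: one sample per row, one dimension per column.
A = np.array([[-2.44528668, -0.09326276, -1.06527892],
              [-1.35144799, -1.45507518, -0.02096   ],
              [-0.5788315 , -1.48932706, -0.28496559],
              [-1.60224949, -0.76823424, -0.11548589],
              [-1.15768561, -0.74704022, -0.14744463]])

B = np.array([[-1.84134663, -1.42036525, -1.38819347],
              [-2.58165693, -2.49423057, -1.57609454],
              [-0.78776371, -0.79168188,  0.21967791],
              [-1.0165618 , -1.78509185, -0.68373997],
              [-1.21764947, -0.43215885, -0.34393573]])

# gaussian_kde expects shape (n_dimensions, n_samples), hence the transpose.
kernel = gaussian_kde(A.T)
print(kernel.d, kernel.n)  # 3 dimensions, 5 samples

# Evaluation points follow the same layout: one column per point.
densities = kernel(B.T)
print(densities.shape)     # one density value per sample in B
```

With this layout, `densities[i]` is the estimated density of A at the i-th sample of B.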

But I still don't know what the error message I was getting means.
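One plausible explanation, offered as a sketch rather than a confirmed diagnosis: gaussian_kde estimates a covariance matrix from the data, and the "leading minor ... is not positive definite" wording is typical of a failed Cholesky factorization, which requires a positive definite matrix. Without the transpose, the 5x3 array is read as 3 samples in 5 dimensions, and a covariance matrix estimated from 3 samples can have rank at most 2, so the 5x5 matrix is singular. The snippet below only compares the ranks of the two covariance matrices with NumPy; it does not reproduce SciPy's internal steps.

```python
import numpy as np

# Array A from the question: one sample per row, one dimension per column.
A = np.array([[-2.44528668, -0.09326276, -1.06527892],
              [-1.35144799, -1.45507518, -0.02096   ],
              [-0.5788315 , -1.48932706, -0.28496559],
              [-1.60224949, -0.76823424, -0.11548589],
              [-1.15768561, -0.74704022, -0.14744463]])

# np.cov treats each row as a variable by default, which matches the
# (n_dimensions, n_samples) layout that gaussian_kde assumes.
cov_as_passed = np.cov(A)     # read as 5 dimensions, 3 samples
cov_transposed = np.cov(A.T)  # read as 3 dimensions, 5 samples

rank_as_passed = np.linalg.matrix_rank(cov_as_passed)
rank_transposed = np.linalg.matrix_rank(cov_transposed)

# A rank below the matrix size means the matrix is singular,
# so it cannot be positive definite.
print(cov_as_passed.shape, rank_as_passed)
print(cov_transposed.shape, rank_transposed)
```

If that reading is right, the same error could also appear with correctly oriented data whenever there are fewer samples than dimensions plus one, or when some dimensions are exactly linearly dependent.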

Upvotes: 3
