Tomsen1410

Reputation: 31

Exponential moving covariance matrix

I have multivariate time series data (e.g. of shape [T, 32]). I filter the data in an online fashion using an exponential moving average and variance, where alpha is the decay factor (according to Wikipedia and this paper):

mean_n = alpha * mean_{n-1} + (1-alpha) * sample
var_n = alpha * (var_{n-1} + (1-alpha) * (sample - mean_{n-1}) * (sample - mean_{n-1}))
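
For reference, the two recursions above can be written as one small NumPy step. This is only a sketch: the function name `ema_update`, the synthetic data, and the initialization (first sample as the mean, zero variance) are assumptions for illustration and are not taken from the original setup.

```python
import numpy as np

def ema_update(mean, var, sample, alpha):
    """One online step; alpha weights the old state, (1 - alpha) the new sample."""
    diff = sample - mean  # deviation from the PREVIOUS mean
    new_mean = alpha * mean + (1.0 - alpha) * sample
    new_var = alpha * (var + (1.0 - alpha) * diff * diff)  # element-wise
    return new_mean, new_var

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=3.0, size=(5000, 32))  # synthetic [T, 32] data

mean = data[0].copy()  # assumption: initialize the mean with the first sample
var = np.zeros(32)     # assumption: start from zero variance
for sample in data[1:]:
    mean, var = ema_update(mean, var, sample, alpha=0.99)
```

Every term added to the variance is a non-negative square, so the element-wise estimate cannot become negative.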

I wanted to replace the moving variance with a moving covariance matrix in order to capture the correlations between the data dimensions (32 in this example). So I simply replaced the element-wise squared difference with an outer product:

covmat_n = alpha * (covmat_{n-1} + (1-alpha) * np.outer((sample - mean_{n-1}), (sample - mean_{n-1})))
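
A minimal sketch of this matrix update, with a look at the eigenvalues and rank of the result. The name `ema_cov_update`, the random data, and the zero initialization are assumptions made for the example. In exact arithmetic each step adds a non-negatively weighted outer product, so the matrix stays symmetric and positive semi-definite, but starting from zeros its rank can grow by at most one per sample.

```python
import numpy as np

def ema_cov_update(mean, cov, sample, alpha):
    """One online step of the exponential moving mean and covariance matrix."""
    diff = sample - mean  # deviation from the previous mean
    new_mean = alpha * mean + (1.0 - alpha) * sample
    new_cov = alpha * (cov + (1.0 - alpha) * np.outer(diff, diff))
    return new_mean, new_cov

rng = np.random.default_rng(0)
dim = 32
data = rng.normal(size=(10, dim))  # deliberately fewer samples than dimensions

mean = data[0].copy()       # assumption: initialize with the first sample
cov = np.zeros((dim, dim))  # assumption: start from a zero matrix
for sample in data[1:]:
    mean, cov = ema_cov_update(mean, cov, sample, alpha=0.9)

eigvals = np.linalg.eigvalsh(cov)   # eigenvalues of a symmetric matrix, ascending
rank = np.linalg.matrix_rank(cov)   # at most 9 after 9 rank-one updates
```

With fewer updates than dimensions the smallest eigenvalues sit at zero up to round-off, which is the regime where a strict positive-definiteness check can fail.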

However, this does not seem to give correct results. For example, when I try to initialize a PyTorch multivariate Gaussian distribution with such a covariance matrix, it sometimes throws an error saying that the covariance matrix contains invalid values. Since the matrix is always symmetric, I assume it violates the positive-definiteness constraint. Other downstream tasks suggest the same (e.g. the KL divergence between two Gaussians with such covariance matrices sometimes came out negative).

Does anyone know what I am doing wrong here? Where is the error in my calculations? And as a bonus question: are my calculations for the simple moving variance correct? It seems strange to multiply the new sample variance by alpha again, but the sources suggest that it is the correct way.
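
On the bonus question, a short derivation suggests the outer factor of alpha is expected. It assumes the moving variance is defined as the exponentially weighted second moment about the current mean. The symbols are shorthand introduced here: \delta_n = x_n - \mu_{n-1}, with x_n the new sample and \mu, \sigma^2 the running mean and variance.

```latex
\begin{align*}
\mu_n &= \alpha\,\mu_{n-1} + (1-\alpha)\,x_n = \mu_{n-1} + (1-\alpha)\,\delta_n \\
\sigma_n^2 &= \alpha\left[\sigma_{n-1}^2 + (\mu_{n-1}-\mu_n)^2\right] + (1-\alpha)\,(x_n-\mu_n)^2 \\
           &= \alpha\,\sigma_{n-1}^2 + \alpha(1-\alpha)^2\,\delta_n^2 + (1-\alpha)\,\alpha^2\,\delta_n^2 \\
           &= \alpha\left(\sigma_{n-1}^2 + (1-\alpha)\,\delta_n^2\right)
\end{align*}
```

The second line re-centers the old variance on the new mean and adds the weighted contribution of the new sample. The third line substitutes \mu_{n-1} - \mu_n = -(1-\alpha)\delta_n and x_n - \mu_n = \alpha\delta_n, and the two cross terms combine into \alpha(1-\alpha)\delta_n^2.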

Upvotes: 0

Views: 729

Answers (1)

Tomsen1410

Reputation: 31

I have found the answer myself. It seems to be a numerical problem. Each sample's outer product has rank one, so it is only positive semi-definite, whereas the eigenvalues of a positive definite matrix must be strictly positive. I solved it by applying an eigenvalue decomposition to every sample's covariance matrix and ensuring that its eigenvalues are larger than zero:

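import numpy as np

diff = sample - last_mean  # deviation of the new sample from the previous running mean
sample_covmat = np.outer(diff, diff)  # rank one, so only positive semi-definite
w, v = np.linalg.eigh(sample_covmat)  # eigendecomposition for symmetric matrices
w += 1e-3  # Note: shifts every eigenvalue above zero; w = np.maximum(w, 1e-3) would clamp instead
sample_covmat = v @ np.diag(w) @ v.T  # v is orthogonal, so v.T stands in for np.linalg.inv(v)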
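
Because the eigenvectors returned by `np.linalg.eigh` are orthonormal, adding a constant to every eigenvalue is the same as adding that constant times the identity matrix, so the decomposition can be skipped. The sketch below checks this on a random vector standing in for `sample - last_mean`. The helper `is_positive_definite` is a hypothetical name introduced here, and the Cholesky attempt is one common way to test positive definiteness, not necessarily the check PyTorch performs.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 32
eps = 1e-3                   # same shift as in the answer above
diff = rng.normal(size=dim)  # stand-in for sample - last_mean
sample_covmat = np.outer(diff, diff)

# Route 1: shift the eigenvalues explicitly.
w, v = np.linalg.eigh(sample_covmat)
shifted = v @ np.diag(w + eps) @ v.T

# Route 2: add a scaled identity ("jitter") without any decomposition.
jittered = sample_covmat + eps * np.eye(dim)

same = np.allclose(shifted, jittered)

def is_positive_definite(mat):
    """Hypothetical helper: Cholesky succeeds only for positive definite input."""
    try:
        np.linalg.cholesky(mat)
        return True
    except np.linalg.LinAlgError:
        return False

jittered_ok = is_positive_definite(jittered)
```

Under the same assumptions, it may be enough to add the jitter once to the running covariance matrix right before it is handed to the distribution, rather than to every per-sample matrix.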

Upvotes: 1
