Soroush

Reputation: 260

Getting negative S value from SVD decomposition in Numpy?

I want to whiten the CIFAR10 dataset using ZCA. The input X_train is of shape (40000, 32, 32, 3), where 40000 is the number of images and 32x32x3 is the size of each image. I'm using the code from this answer for this purpose:

import numpy as np

X_flat = np.reshape(X_train, (-1, 32*32*3))
# compute the covariance of the image data
cov = np.cov(X_flat, rowvar=True)   # cov is (N, N)
# singular value decomposition
U, S, V = np.linalg.svd(cov)        # U is (N, N), S is (N,)
# build the ZCA matrix
epsilon = 1e-5
zca_matrix = np.dot(U, np.dot(np.diag(1.0/np.sqrt(S + epsilon)), U.T))
# transform the image data; zca_matrix is (N, N)
zca = np.dot(zca_matrix, X_flat)    # zca is (N, 3072)

However, at run time I encountered the following warning:

D:\toolkits.win\anaconda3-5.2.0\envs\dlwin36\lib\site-packages\ipykernel_launcher.py:8: RuntimeWarning: invalid value encountered in sqrt

So after I got the SVD output, I tried:

print(np.min(S)) # prints -1.7798217

This is unexpected, because the singular values in S should all be non-negative. The ZCA whitening result was also wrong: it contained nan values.

I tried to reproduce this by re-running the same code a second time. This time I did not encounter any warnings or negative S values, but instead I got:

print(np.min(S)) # prints nan

Any idea for why this might have happened?


Update: I restarted the kernel to free up CPU and RAM, and ran the code again. Again I got the same warning about negative values being passed to np.sqrt(). Not sure if it helps, but I've also attached the CPU and RAM utilization figures:

activity monitor figures

Upvotes: 4

Views: 1377

Answers (1)

Ahmed Fasih

Reputation: 6927

Here are a couple of ideas. I don't have your dataset so I can't be totally sure that these will fix your problem, but I'm confident enough to post this as an answer instead of a comment.

First: your flattened X_flat is 40,000 by 3072, where each row is a data vector and each column is a variable (a feature). You want the covariance matrix that is 3072 by 3072, so pass rowvar=False to np.cov.
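(To make the rowvar flag concrete, here is a toy shape check; the small array A is my own illustration, not part of the original code:)

import numpy as np
A = np.random.rand(5, 3)              # 5 observations of 3 features
print(np.cov(A, rowvar=True).shape)   # (5, 5): each row treated as a variable
print(np.cov(A, rowvar=False).shape)  # (3, 3): each column treated as a variable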

I'm not really sure why the SVD of the 40,000 by 40,000 covariance matrix is diverging. Assuming you have enough RAM to store that 12 GB matrix, the one thing I can think of is numerical overflow, because you're perhaps not removing the mean of the data, as ZCA (and any other whitening technique) expects.

Second: remove the mean: X_zeromean = X_flat - np.mean(X_flat, 0).

If you do these, then the final step has to be modified a tiny bit (to make dimensions line up). Here's a quick check using uniform random data:

import numpy as np
X_flat = np.random.rand(40000, 32*32*3)
X_zeromean = X_flat - np.mean(X_flat, 0)  # subtract the per-feature mean
cov = np.cov(X_zeromean, rowvar=False)    # cov is (3072, 3072)
U, S, V = np.linalg.svd(cov)              # eigendecomposition of the symmetric cov
epsilon = 1e-5
zca_matrix = np.dot(U, np.dot(np.diag(1.0/np.sqrt(S + epsilon)), U.T))
zca = np.dot(zca_matrix, X_zeromean.T)    # <-- transpose needed here; zca is (3072, 40000)

As a sanity check, np.cov(zca) is now very close to the identity matrix, as desired (note that zca has flipped dimensions compared to the input).
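For example, a quick numerical check (this snippet is my own addition; run it right after the code above):

# covariance of the whitened data should be close to the identity;
# it won't be exact because of the epsilon regularization
C_white = np.cov(zca)  # (3072, 3072)
print(np.allclose(C_white, np.eye(C_white.shape[0]), atol=1e-3))  # True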

(As a side note, this is a really expensive and numerically unstable way to whiten the data array: you don't need to compute the covariance and then take its SVD, which does the work twice. You can take the skinny SVD of the data matrix itself (np.linalg.svd with the full_matrices=False flag) and compute the whitening matrix directly from that, without ever evaluating the expensive outer product for the covariance matrix; see the sketch below.)
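Here's a minimal sketch of that shortcut, assuming the same random zero-mean data as above (the intermediate names n and eigvals are my own). The key fact is that if the zero-mean data has singular values S, the eigenvalues of its covariance are S**2 / (n - 1):

import numpy as np
X_flat = np.random.rand(40000, 32*32*3)
X_zeromean = X_flat - np.mean(X_flat, 0)
# skinny SVD of the data matrix itself: no (3072, 3072) covariance needed
U, S, Vt = np.linalg.svd(X_zeromean, full_matrices=False)  # S is (3072,)
n = X_zeromean.shape[0]
eigvals = S**2 / (n - 1)  # eigenvalues of np.cov(X_zeromean, rowvar=False)
epsilon = 1e-5
# same ZCA matrix as before, built from Vt instead of the covariance's eigenvectors
zca_matrix = np.dot(Vt.T, np.dot(np.diag(1.0/np.sqrt(eigvals + epsilon)), Vt))
zca = np.dot(zca_matrix, X_zeromean.T)  # (3072, 40000), matching the snippet above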

Upvotes: 2
