Reputation: 407
I am using X = randsrc(250,600,[[-1,0,1];[0.5/ps,1-1/ps,0.5/ps]]) with ps=2373. This generates a 250-by-600 matrix whose entries are only -1, 0, or 1, chosen at random with probabilities 0.5/ps, 1-1/ps, and 0.5/ps respectively.
The density (the fraction of non-zero entries) is therefore 1/ps, about 0.00042.
Such an X is called a sparse random projection matrix, see https://web.stanford.edu/~hastie/Papers/Ping/KDD06_rp.pdf. It can be used to compress a data vector from dimension 600 to 250 while guaranteeing some nice geometric properties.
The problem is that in Matlab, randsrc seems to be very slow (e.g., compared with randn(250,600)). How can I generate the above matrix quickly?
BTW, how can I compute X*y quickly, where y may be a dense vector?
My code is:
ps=2373;
tic;
X = randsrc(250,600,[[-1,0,1];[0.5/ps,1-1/ps,0.5/ps]]);
toc
a = randn(600,1);
tic;
X*a;
toc
Also, I have tried the equivalent Python class, http://scikit-learn.org/stable/modules/generated/sklearn.random_projection.SparseRandomProjection.html, and it is twice as fast as Matlab.
Upvotes: 1
Views: 621
Reputation: 474
You can use sprand to generate a sparsity structure, then find to extract the rows and columns of the non-zero elements. Finally randsample will select values -1,1 with 50% probability of each:
ps=2373;
tic
[i,j] = find(sprand(250,600,1/ps));                   % row/column indices of the non-zeros
X = sparse(i,j,randsample([-1,1],length(i),true),250,600);   % assign -1 or +1 to each, keep the 250-by-600 size
toc
Because X is stored as a sparse matrix, X*a only touches the non-zero entries (about 250*600/ps, i.e. roughly 63 here), so the multiplication is very fast.
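If randsample is not available (it ships with the Statistics toolbox), a rough alternative is to threshold a single uniform draw per entry. The sketch below is only an illustration: the names m, n, R and y are my own, and it builds a dense 250-by-600 temporary, which is fine at this size but would not scale to very large matrices.

```matlab
% Sketch: sparse random projection matrix without randsrc or randsample.
% m, n, R and y are illustrative names, not from the question.
ps = 2373;             % sparsity parameter from the question
m = 250; n = 600;      % projection dimensions from the question

% One uniform draw per entry: the lowest 0.5/ps of [0,1] maps to -1,
% the highest 0.5/ps maps to +1, everything in between stays 0.
R = rand(m, n);
X = sparse((R > 1 - 0.5/ps) - (R < 0.5/ps));

a = randn(n, 1);
y = X * a;             % sparse-times-dense product, work scales with nnz(X)

fprintf('nnz(X) = %d, expected about %.1f\n', nnz(X), m*n/ps);
```

Unlike sprand, whose density argument is only approximate, each entry here is drawn independently with exactly the probabilities 0.5/ps, 1-1/ps, 0.5/ps from the question.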
Upvotes: 0