Reputation: 4745
I need to generate a sparse random matrix in Python with all values in the range [-1,1]
with uniform distribution. What is the most efficient way to do this?
I have a basic sparse random matrix:
from scipy import sparse
from numpy.random import RandomState
p = sparse.rand(10, 10, 0.1, random_state=RandomState(1))
And this gives me values in [0,1]:
print(p)
(0, 0) 0.419194514403
(0, 3) 0.0273875931979
(1, 4) 0.558689828446
(2, 7) 0.198101489085
(3, 5) 0.140386938595
(4, 1) 0.204452249732
(4, 3) 0.670467510178
(8, 1) 0.878117436391
(9, 0) 0.685219500397
(9, 3) 0.417304802367
It would be good to have an in-place solution, or something that doesn't require blowing it up to a full matrix, since in practice I will be using very large dimensions. It surprises me that there aren't some quick parameters to set for sparse.rand itself.
Upvotes: 6
Views: 1722
Reputation: 231355
Since sparse.rand
produces a coo
matrix by default, you can directly manipulate its .data
attribute (a 'csr' matrix could be transformed the same way):
p = sparse.rand(10, 10, 0.1)
p.data *= 2
p.data -= 1
Before and after values would be:
(0, 4) 0.758811389117
(1, 8) 0.703514506105
(1, 9) 0.640418745353
(4, 0) 0.896198785835
(4, 6) 0.511459880587
(5, 2) 0.580048680358
(7, 1) 0.739418689993
(8, 3) 0.506395207688
(8, 5) 0.900696518461
(9, 4) 0.474014207942
(0, 4) 0.517622778234
(1, 8) 0.40702901221
(1, 9) 0.280837490706
(4, 0) 0.79239757167
(4, 6) 0.0229197611736
(5, 2) 0.160097360716
(7, 1) 0.478837379986
(8, 3) 0.0127904153758
(8, 5) 0.801393036923
(9, 4) -0.051971584115
Same spatial density, just different value distribution.
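To make that concrete, here is a self-contained sketch of the same in-place transformation (using the question's dimensions), with checks that only the stored values change, not the sparsity pattern:

```python
import numpy as np
from scipy import sparse

# Build the sparse random matrix; values start out uniform in [0, 1).
p = sparse.rand(10, 10, 0.1, random_state=np.random.RandomState(1))
nnz_before = p.nnz

# Affine map [0,1] -> [-1,1], applied only to the stored (nonzero) entries.
p.data *= 2
p.data -= 1

# The sparsity pattern is unchanged; only the values moved.
assert p.nnz == nnz_before
assert p.data.min() >= -1 and p.data.max() <= 1
```

Since the operation touches only the `.data` array, it never materializes the full matrix, so it stays cheap even for very large dimensions.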
In fact you could generate completely new .data
values. The end of sparse.rand
is:
....
j = .... # tweak random values
i = ... # tweak ints
vals = random_state.rand(k).astype(dtype)
return coo_matrix((vals, (i, j)), shape=(m, n)).asformat(format)
The sparse matrix is generated from 3 random sequences: two producing integer row and column indices in the right ranges, and a third producing the random values.
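Following that recipe, a minimal sketch of building the matrix yourself with uniform [-1,1) values directly (the index sampling here is simplified relative to scipy's internals, but it keeps the k positions distinct):

```python
import numpy as np
from scipy.sparse import coo_matrix

rng = np.random.RandomState(1)
m, n, density = 10, 10, 0.1
k = int(round(density * m * n))  # number of stored entries

# Sample k distinct flat positions, then split into row/column indices.
flat = rng.choice(m * n, size=k, replace=False)
i, j = np.divmod(flat, n)

# Draw the values uniformly from [-1, 1) instead of the default [0, 1).
vals = rng.uniform(-1, 1, size=k)

p = coo_matrix((vals, (i, j)), shape=(m, n))
```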
For example random values chosen from a list:
In [209]: p.data = np.random.choice(np.arange(20) - 10, len(p.data)) / 10
In [210]: print(p.A)
[[ 0. 0. 0. 0. 0.9 0. 0. 0. 0. 0. ]
[ 0. 0. 0. 0. 0. 0. 0. 0. -0.1 -0.7]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ]
[-1. 0. 0. 0. 0. 0. -0.8 0. 0. 0. ]
[ 0. 0. 0.5 0. 0. 0. 0. 0. 0. 0. ]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ]
[ 0. 0. 0. 0.5 0. 0.4 0. 0. 0. 0. ]
[ 0. 0. 0. 0. -0.8 0. 0. 0. 0. 0. ]]
The development code just changes the second-to-last line to:
vals = data_rvs(k).astype(dtype)
where data_rvs
is a parameter (defaulting to random_state.rand
).
Upvotes: 2
Reputation: 2973
Looks like the feature that you want was added about two months ago and will be available in scipy 0.16: https://github.com/scipy/scipy/blob/77af8f44bef43a67cb14c247bc230282022ed0c2/scipy/sparse/construct.py#L671
You will be able to call sparse.random(10, 10, 0.1, random_state=RandomState(1), data_rvs=func)
where func
"should take a single argument specifying the length of the ndarray that it will return. The structurally nonzero entries of the sparse random matrix will be taken from the array sampled by this function." So you will be able to provide an arbitrary distribution to sample from.
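With a SciPy version that includes this feature, a sketch of the call looks like the following (assuming scipy.sparse.random with the data_rvs parameter is available in your installed version):

```python
import numpy as np
from scipy import sparse

rng = np.random.RandomState(1)

# data_rvs receives the number of values to draw and returns that many
# samples; here it draws them uniformly from [-1, 1).
p = sparse.random(10, 10, density=0.1, random_state=rng,
                  data_rvs=lambda k: rng.uniform(-1, 1, k))
```

This sidesteps any post-processing: the nonzero entries are sampled from the desired distribution at construction time.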
For now, you can at least stretch the uniform distribution to [0,N] by multiplying p by a scalar N:
>>> print 2*p
(0, 0) 0.838389028807
(9, 0) 1.37043900079
(4, 1) 0.408904499463
(8, 1) 1.75623487278
(0, 3) 0.0547751863959
(4, 3) 1.34093502036
(9, 3) 0.834609604734
(1, 4) 1.11737965689
(3, 5) 0.28077387719
(2, 7) 0.39620297817
You can't add a scalar to a sparse matrix, but as a bit of a hack you can create a sparse matrix with ones at the nonzero positions via p.ceil()
, since all elements of p were generated within [0,1]. Then, to transform the uniform distribution to [-1,1], you can do
print(2*p - p.ceil())
(0, 0) -0.161610971193
(0, 3) -0.945224813604
(1, 4) 0.117379656892
(2, 7) -0.60379702183
(3, 5) -0.71922612281
(4, 1) -0.591095500537
(4, 3) 0.340935020357
(8, 1) 0.756234872782
(9, 0) 0.370439000794
(9, 3) -0.165390395266
So in general if you need some interval [a,b] just perform:
p = (b - a)*p + a*p.ceil()
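A quick sketch checking that transformation for an arbitrary interval (the values of a and b here are chosen purely for illustration):

```python
import numpy as np
from scipy import sparse

p = sparse.rand(10, 10, 0.1, random_state=np.random.RandomState(1))
a, b = -3.0, 5.0  # arbitrary target interval

# (b - a)*p rescales the stored [0,1] values to [0, b-a]; a*p.ceil()
# adds a at exactly the stored positions, shifting them into [a, b].
q = (b - a) * p + a * p.ceil()

# Check the transformed values at the original nonzero positions.
vals = q.toarray()[p.toarray() != 0]
assert vals.min() >= a and vals.max() <= b
```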
I can't see much of a better solution at present, short of writing your own constructor similar to sparse.rand
, but I would be curious to know if anyone knows a way to get around the ceil()
hack.
Upvotes: 6