Jimmy C

Reputation: 9680

How can I create a matrix based on smaller matrices?

I'm trying to build a matrix from 1xN matrices in a fast and efficient way, to be used later as features for scikit-learn training. One of the many things I've tried so far is:

np.matrix(list(func(text) for text in data_test.data))

This creates a matrix of matrices, like this:

matrix([[ <1x188796 sparse matrix of type '<type 'numpy.float64'>'
    with 10921 stored elements in Compressed Sparse Row format>,
         <1x188796 sparse matrix of type '<type 'numpy.float64'>'
    with 17651 stored elements in Compressed Sparse Row format>,
         <1x188796 sparse matrix of type '<type 'numpy.float64'>'
    with 28180 stored elements in Compressed Sparse Row format>,...

That obviously isn't what I'm looking for. How can I turn this into a single proper sparse matrix, like this:

<76002x108800 sparse matrix of type '<type 'numpy.float64'>'
with 807960 stored elements in Compressed Sparse Row format>

Upvotes: 0

Views: 89

Answers (1)

pv.

Reputation: 35125

How about scipy.sparse.vstack? See http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.sparse.vstack.html
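A minimal sketch of how that could look, assuming func returns one 1xN CSR row per text. The func below, the feature width of 8, and the sample texts are made-up stand-ins, not the code from the question:

```python
import numpy as np
from scipy import sparse

N_FEATURES = 8  # hypothetical width; the question's rows have 188796 columns


def func(text):
    """Hypothetical stand-in for the question's func: one 1xN CSR row per text."""
    counts = np.zeros(N_FEATURES)
    for ch in text:
        counts[ord(ch) % N_FEATURES] += 1.0
    # A 1-D array is interpreted as a single row, giving a 1xN matrix.
    return sparse.csr_matrix(counts)


texts = ["spam", "ham and eggs", ""]  # made-up data
rows = [func(text) for text in texts]

# Stack the 1xN rows into one (len(texts) x N) sparse matrix
# instead of wrapping them in np.matrix.
X = sparse.vstack(rows, format="csr")
print(repr(X))
```

The result is a single CSR matrix with one row per text, which is the form scikit-learn estimators accept as a feature matrix.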

If that's too slow, take the fast path from here: https://github.com/scipy/scipy/blob/master/scipy/sparse/construct.py#L396 (in future SciPy versions, vstack itself will be fast in this case).
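The general idea behind such a fast path can be sketched as follows: when every block is already in CSR format with the same number of columns, the data, indices, and indptr arrays can simply be concatenated. This is an illustration under that assumption, not a copy of the linked SciPy code, and stack_csr_rows is a hypothetical helper name:

```python
import numpy as np
from scipy import sparse


def stack_csr_rows(blocks):
    """Vertically stack CSR matrices that share a column count (hypothetical helper)."""
    blocks = [sparse.csr_matrix(b) for b in blocks]
    n_cols = blocks[0].shape[1]
    if any(b.shape[1] != n_cols for b in blocks):
        raise ValueError("all blocks must have the same number of columns")

    # Stored values and their column indices are appended block after block.
    data = np.concatenate([b.data[:b.nnz] for b in blocks])
    indices = np.concatenate([b.indices[:b.nnz] for b in blocks])

    # Each block's indptr starts at 0, so shift it by the number of
    # stored elements that precede it and drop its leading 0.
    indptr_parts = [np.zeros(1, dtype=np.int64)]
    offset = 0
    for b in blocks:
        indptr_parts.append(b.indptr[1:].astype(np.int64) + offset)
        offset += int(b.indptr[-1])
    indptr = np.concatenate(indptr_parts)

    n_rows = sum(b.shape[0] for b in blocks)
    return sparse.csr_matrix((data, indices, indptr), shape=(n_rows, n_cols))


# Made-up 1x4 rows, including one with no stored elements.
rows = [
    sparse.csr_matrix(np.array([[1.0, 0.0, 2.0, 0.0]])),
    sparse.csr_matrix(np.array([[0.0, 0.0, 0.0, 0.0]])),
    sparse.csr_matrix(np.array([[0.0, 3.0, 0.0, 4.0]])),
]
stacked = stack_csr_rows(rows)
```

This avoids building intermediate matrices for each pair of rows, which is where most of the time goes when stacking many thousands of single-row matrices.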

Upvotes: 2
