Reputation: 39900
I have a coo_matrix a with shape (40106, 2048) and a column numpy array b with shape (40106,).
What I want to do is simply concatenate the matrix and the array (i.e. the resulting data structure will have shape (40106, 2049)).
I've tried to use hstack as shown below:
concat = hstack([a, b])
but I get the following error:
File "/Users/usr/anaconda/lib/python3.5/site-packages/scipy/sparse/construct.py", line 464, in hstack
return bmat([blocks], format=format, dtype=dtype)
File "/Users/usr/anaconda/lib/python3.5/site-packages/scipy/sparse/construct.py", line 581, in bmat
'row dimensions' % i)
ValueError: blocks[0,:] has incompatible row dimensions
I don't quite get why the dimensions do not match, since both a and b have the same number of rows.
Upvotes: 2
Views: 1053
Reputation: 231550
I assume that's sparse.hstack. Your b, when converted to a matrix, will have shape (1,40106), so its row count does not match that of a. Try turning it into a sparse matrix of the correct shape before passing it to hstack. hstack passes the job to bmat, which ends up joining the coo attributes of all the input matrices, thus making a new matrix:
In [66]: from scipy import sparse
In [67]: A = sparse.coo_matrix(np.eye(3))
In [68]: b = np.ones(3)
In [69]: sparse.hstack((A,b))
....
ValueError: blocks[0,:] has incompatible row dimensions
In [70]: B=sparse.coo_matrix(b)
In [71]: B
Out[71]:
<1x3 sparse matrix of type '<class 'numpy.float64'>'
with 3 stored elements in COOrdinate format>
In [72]: sparse.hstack((A,B.T))
Out[72]:
<3x4 sparse matrix of type '<class 'numpy.float64'>'
with 6 stored elements in COOrdinate format>
In [73]: _.A
Out[73]:
array([[ 1., 0., 0., 1.],
[ 0., 1., 0., 1.],
[ 0., 0., 1., 1.]])
This also works (as in Divakar's answer):
In [74]: sparse.hstack((A,b[:,None]))
Out[74]:
<3x4 sparse matrix of type '<class 'numpy.float64'>'
with 6 stored elements in COOrdinate format>
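For reference, the shape checks above can be collected into a standalone script. This is only a minimal sketch, assuming NumPy and SciPy are installed; the names row_shape, col and stacked are illustrative and not part of the session above.

```python
import numpy as np
from scipy import sparse

A = sparse.coo_matrix(np.eye(3))   # sparse matrix, shape (3, 3)
b = np.ones(3)                     # 1-D array, shape (3,)

# A 1-D array is converted to a single-row sparse matrix,
# which is why its row count does not match A
row_shape = sparse.coo_matrix(b).shape

# Adding a trailing axis turns b into a column with one row per row of A
col = b[:, None]
stacked = sparse.hstack((A, col))

print(row_shape, col.shape, stacked.shape)
```

Transposing the single-row sparse matrix (B.T in the session above) and indexing with b[:, None] both give a column of the required height, so either form can be passed to hstack.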
My hstack does:
return bmat([blocks], format=format, dtype=dtype)
So a direct call to bmat also works:
In [93]: sparse.bmat([[A, B.T]])
Out[93]:
<3x4 sparse matrix of type '<class 'numpy.float64'>'
with 6 stored elements in COOrdinate format>
sparse.bmat([A, B.T]) (a flat list instead of a nested one) produces the blocks must be 2d error.
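The difference between the nested and the flat block list can be checked with a short script. This is a hedged sketch assuming a SciPy version in which bmat rejects a one-dimensional block list with a ValueError; the names nested and flat_error are made up for illustration.

```python
import numpy as np
from scipy import sparse

A = sparse.coo_matrix(np.eye(3))
B = sparse.coo_matrix(np.ones(3))   # shape (1, 3)

# bmat expects a 2-D grid of blocks: one inner list per block row
nested = sparse.bmat([[A, B.T]])
print(nested.shape)

# A flat list is a 1-D sequence of blocks, which bmat refuses
flat_error = None
try:
    sparse.bmat([A, B.T])
except ValueError as exc:
    flat_error = exc
print(flat_error)
```

This is why hstack wraps its argument in an extra list before handing it to bmat.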
Upvotes: 1
Reputation: 221634
Convert the second array, which is 1D, to 2D and then use hstack:
hstack([A,B[:,None]])
Sample run -
In [86]: from scipy.sparse import coo_matrix, hstack
# Sample inputs as a coo_matrix and an array
In [87]: A = coo_matrix([[1, 2, 0], [3, 0, 4]])
...: B = np.array([5, 6])
...:
# Use proposed solution
In [88]: out = hstack([A,B[:,None]])
# Print the dense version to visually verify
In [89]: out.toarray()
Out[89]:
array([[1, 2, 0, 5],
[3, 0, 4, 6]])
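The same idea applied to the shapes from the question might look like the sketch below. The data is random stand-in data, not the asker's; the number of nonzeros (nnz) and the choice of format="csr" for the result are assumptions made for illustration.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

# Stand-in for the question's matrix: random entries, same shape
n_rows, n_cols, nnz = 40106, 2048, 50_000
rows = rng.integers(0, n_rows, size=nnz)
cols = rng.integers(0, n_cols, size=nnz)
vals = rng.random(nnz)
a = sparse.coo_matrix((vals, (rows, cols)), shape=(n_rows, n_cols))

# Stand-in for the question's 1-D array
b = rng.random(n_rows)

# Make b a (40106, 1) column, then stack it to the right of a
concat = sparse.hstack([a, b[:, None]], format="csr")
print(concat.shape)
```

The format argument is optional; without it hstack returns the result in COO format, as in the sample run above.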
Upvotes: 1