Reputation: 3335
Suppose I have a numpy array:
data = np.array([[1,1,1],[2,2,2],[3,3,3]])
and I have a corresponding "vector":
vector = np.array([1,2,3])
How do I operate on data along each row to either subtract or divide so the result is:
sub_result = [[0,0,0], [0,0,0], [0,0,0]]
div_result = [[1,1,1], [1,1,1], [1,1,1]]
Long story short: How do I perform an operation on each row of a 2D array with a 1D array of scalars that correspond to each row?
Upvotes: 192
Views: 178762
Reputation: 41
The key is to reshape the vector from shape (3,) to (3,1) to divide each row by its corresponding element, or to (1,3) to divide each column by its corresponding element. Because the reshaped vector's shape does not match data.shape, NumPy broadcasts it to (3,3) and performs the division element-wise.
In[1]: data/vector.reshape(-1,1)
Out[1]:
array([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
In[2]: data/vector.reshape(1,-1)
Out[2]:
array([[1. , 0.5 , 0.33333333],
[2. , 1. , 0.66666667],
[3. , 1.5 , 1. ]])
Similarly, keepdims=True keeps the reduced axis, so the sums broadcast back against x:
x = np.arange(9).reshape(3,3)
x
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
x/np.sum(x, axis=0, keepdims=True)
array([[0. , 0.08333333, 0.13333333],
[0.33333333, 0.33333333, 0.33333333],
[0.66666667, 0.58333333, 0.53333333]])
x/np.sum(x, axis=1, keepdims=True)
array([[0. , 0.33333333, 0.66666667],
[0.25 , 0.33333333, 0.41666667],
[0.28571429, 0.33333333, 0.38095238]])
print(np.sum(x, axis=0).shape)
print(np.sum(x, axis=1).shape)
print(np.sum(x, axis=0, keepdims=True).shape)
print(np.sum(x, axis=1, keepdims=True).shape)
(3,)
(3,)
(1, 3)
(3, 1)
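To make the "expands vector's shape to (3,3)" step concrete, here is a minimal sketch using np.broadcast_to, which returns the stretched operand that broadcasting conceptually works with. The names column, row, col_expanded and row_expanded are only illustrative.

```python
import numpy as np

data = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
vector = np.array([1, 2, 3])

column = vector.reshape(-1, 1)  # shape (3, 1): one divisor per row
row = vector.reshape(1, -1)     # shape (1, 3): one divisor per column

# Read-only views showing how each operand is stretched to data.shape
col_expanded = np.broadcast_to(column, data.shape)
row_expanded = np.broadcast_to(row, data.shape)

print(col_expanded)
# [[1 1 1]
#  [2 2 2]
#  [3 3 3]]
print(row_expanded)
# [[1 2 3]
#  [1 2 3]
#  [1 2 3]]

# Dividing by the (3, 1) column is the per-row operation the question asks for
by_row = data / column
```

No copy of the vector is actually made during the division; broadcast_to is used here only to visualize the alignment.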
Upvotes: 3
Reputation: 2167
A Pythonic way to do this is:
np.divide(data.T,vector).T
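The transposes take care of aligning the axes, and under Python 3 np.divide performs true division, so the results are floating point. The integer results shown in some other answers reflect Python 2 integer division.
Note: the length of vector must match the number of rows in data (that is, the number of columns in data.T).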
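A quick way to see which dimension has to match is to try a non-square array. The sketch below assumes Python 3 and uses a made-up 4x3 data array; it also checks that the transpose approach agrees with indexing the vector as a column.

```python
import numpy as np

# Hypothetical non-square example: 4 rows, 3 columns
data = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])
vector = np.array([1, 2, 3, 4])  # one divisor per row of data

# data.T has shape (3, 4), so its last axis lines up with vector's 4 entries
via_transpose = np.divide(data.T, vector).T

# Same operation without transposing: make vector a (4, 1) column
via_newaxis = data / vector[:, np.newaxis]

print(via_transpose.shape)  # (4, 3)
print(via_transpose.dtype)  # float64
print(np.array_equal(via_transpose, via_newaxis))  # True
```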
Upvotes: 9
Reputation: 2191
Adding to the answer of stackoverflowuser2010, in the general case you can just use
data = np.array([[1,1,1],[2,2,2],[3,3,3]])
vector = np.array([1,2,3])
data / vector.reshape(-1,1)
This will turn your vector into a column vector (a matrix with a single column), allowing you to do the element-wise operations you want. At least to me, this is the most intuitive way of going about it, and since (in most cases) NumPy will just use a view of the same internal memory for the reshaping, it's efficient too.
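The claim about views can be checked with np.shares_memory. This is a small sketch for the contiguous 1D case in the question; for non-contiguous inputs reshape may have to copy, which is the "in most cases" caveat.

```python
import numpy as np

vector = np.array([1, 2, 3])
column = vector.reshape(-1, 1)  # shape (3, 1)

# For a contiguous array, reshape returns a view rather than a copy
shares = np.shares_memory(vector, column)
print(shares)  # True

# Writing through the view changes the original, confirming shared memory
column[0, 0] = 10
print(vector)  # [10  2  3]
```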
Upvotes: 10
Reputation: 40879
JoshAdel's solution uses np.newaxis to add a dimension. An alternative is to use reshape() to align the dimensions in preparation for broadcasting.
data = np.array([[1,1,1],[2,2,2],[3,3,3]])
vector = np.array([1,2,3])
data
# array([[1, 1, 1],
# [2, 2, 2],
# [3, 3, 3]])
vector
# array([1, 2, 3])
data.shape
# (3, 3)
vector.shape
# (3,)
data / vector.reshape((3,1))
# array([[1, 1, 1],
# [1, 1, 1],
# [1, 1, 1]])
Performing the reshape() allows the dimensions to line up for broadcasting:
data: 3 x 3
vector: 3
vector reshaped: 3 x 1
Note that data/vector is ok, but it doesn't get you the answer that you want. It divides each column of data (instead of each row) by the corresponding element of vector. It's what you would get if you explicitly reshaped vector to be 1x3 instead of 3x1.
data / vector
# array([[1, 0, 0],
# [2, 1, 0],
# [3, 1, 1]])
data / vector.reshape((1,3))
# array([[1, 0, 0],
# [2, 1, 0],
# [3, 1, 1]])
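If you run this on Python 3, the / operator performs true division and returns floats, so the integer outputs shown above correspond to floor division (//). A short sketch of the difference, with the variable names chosen only for illustration:

```python
import numpy as np

data = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
vector = np.array([1, 2, 3])

# Row-wise: vector reshaped to 3 x 1
true_div = data / vector.reshape((3, 1))    # float result, all 1.0
floor_div = data // vector.reshape((3, 1))  # integer result, all 1

# Column-wise: plain vector broadcasts as 1 x 3
col_true = data / vector    # floats such as 0.5 and 0.333...
col_floor = data // vector  # integers, fractions floored

print(floor_div)
# [[1 1 1]
#  [1 1 1]
#  [1 1 1]]
print(col_floor)
# [[1 0 0]
#  [2 1 0]
#  [3 1 1]]
```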
Upvotes: 4
Reputation: 10690
As has been mentioned, indexing with None or with np.newaxis is a great way to do this.
Another alternative is to use transposes and broadcasting, as in
(data.T - vector).T
and
(data.T / vector).T
For higher dimensional arrays you may want to use the swapaxes method of NumPy arrays or the NumPy rollaxis function.
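As a rough sketch of the higher-dimensional case, assume a made-up 3D array where the vector should be subtracted along axis 1. The example uses swapaxes as described, and also np.moveaxis, which I've swapped in for rollaxis since the NumPy docs recommend it for new code; a plain reshape is included as a reference result.

```python
import numpy as np

# Hypothetical 3D data: shape (2, 3, 4); vector has one entry per index on axis 1
data = np.arange(24).reshape(2, 3, 4)
vector = np.array([10, 20, 30])

# Reference: give vector shape (1, 3, 1) so it lines up with axis 1
expected = data - vector.reshape(1, 3, 1)

# swapaxes: move axis 1 to the end, broadcast, then swap back
via_swapaxes = (data.swapaxes(1, -1) - vector).swapaxes(1, -1)

# moveaxis: same idea, but the other axes keep their relative order
via_moveaxis = np.moveaxis(np.moveaxis(data, 1, -1) - vector, -1, 1)

print(np.array_equal(via_swapaxes, expected))  # True
print(np.array_equal(via_moveaxis, expected))  # True
```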
There really are a lot of ways to do this.
For a fuller explanation of broadcasting, see http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
Upvotes: 19
Reputation: 68682
Here you go. You just need to use None (or alternatively np.newaxis) combined with broadcasting:
In [6]: data - vector[:,None]
Out[6]:
array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0]])
In [7]: data / vector[:,None]
Out[7]:
array([[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
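For anyone wondering why None and np.newaxis are interchangeable here: np.newaxis is just an alias for None. The sketch below shows that, plus np.expand_dims as a more explicit spelling of the same step; the variable names are only illustrative.

```python
import numpy as np

data = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
vector = np.array([1, 2, 3])

# np.newaxis is literally None, so both spellings index the same way
same_object = np.newaxis is None
print(same_object)  # True

col_none = vector[:, None]                   # shape (3, 1)
col_newaxis = vector[:, np.newaxis]          # shape (3, 1)
col_expand = np.expand_dims(vector, axis=1)  # shape (3, 1)

# Each column form subtracts vector[i] from every element of row i
sub_result = data - col_expand
print(sub_result)
# [[0 0 0]
#  [0 0 0]
#  [0 0 0]]
```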
Upvotes: 267