Reputation: 1531
I have an array of 2-dimensional arrays in Python, all of which have 20 rows but a variable number of columns (between 80 and 90 each).
I would like to iteratively add the numerical (float) values within these two-dimensional arrays to create one final two-dimensional array (see my schematic below).
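Something like this, with two small arrays:

[[1, 2, 3],    [[10, 20],     [[11, 22, 3],
 [4, 5, 6]] +   [30, 40]]  =   [34, 45, 6]]

(positions not covered by the smaller array just keep the other array's values)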
I'm new to Python and the numpy library, and I have found a few functions that might be what I'm looking for, but I can't seem to get them to work.
CONCATENATE (docs here) says that it will join two matrices of different sizes (provided one axis is identical in all of them, I think?), but it doesn't do the addition step, and I don't know how to apply it iteratively. I want to initialize an empty numpy array outside the scope of my loop so I can add to it over and over and have the values persist, but when I try to initialize an array with:
my_final_matrix = np.array()
It throws an error because no array is passed to the constructor (np.array() requires an argument).
FLATTEN/RESHAPE (docs here) reduce dimensionality but don't add the values.
In short -- how do I iteratively add different-sized matrices in numpy?
Upvotes: 2
Views: 1506
Reputation: 53089
Here is a solution that autodetects the required output size:
>>> import numpy as np
>>>
>>> # create ragged list
>>> n = 4
>>> ragged = list(map(np.full, np.random.randint(1, 6, (n, 2)), 10**np.arange(n)))
>>>
>>> ragged
[array([[1, 1, 1]]), array([[10, 10],
       [10, 10],
       [10, 10]]), array([[100],
       [100],
       [100]]), array([[1000, 1000, 1000, 1000],
       [1000, 1000, 1000, 1000]])]
>>>
>>> # find maximum size in each dimension
>>> maxsh = *map(max, zip(*map(np.shape, ragged))),
>>> # allocate result
>>> result = np.zeros(maxsh, dtype=ragged[0].dtype)
>>> # and add
>>> for r in ragged:
...     result[(*map(slice, r.shape),)] += r
...
>>> result
array([[1111, 1011, 1001, 1000],
       [1110, 1010, 1000, 1000],
       [ 110,   10,    0,    0]])
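If the starred map/slice expressions are hard to follow, here is a more explicit sketch of the same idea for the 2-D case (the function and variable names are mine):

import numpy as np

def ragged_sum(arrays):
    # Find the largest extent along each axis across all inputs.
    max_rows = max(a.shape[0] for a in arrays)
    max_cols = max(a.shape[1] for a in arrays)
    # Allocate a zero-filled result big enough to hold any input.
    result = np.zeros((max_rows, max_cols), dtype=arrays[0].dtype)
    # Add each input into the top-left corner of the result;
    # positions an input doesn't cover are left untouched (zero).
    for a in arrays:
        result[:a.shape[0], :a.shape[1]] += a
    return result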
Upvotes: 1
Reputation: 11
I don't know of a single numpy function that can do this, but if a for loop is acceptable, you could do this:
import numpy as np

array_sum = np.zeros((20, 90))  # 20 rows, up to 90 columns per the question
for array in arrays:
    array_sum[:, :array.shape[1]] += array  # add into the left-most columns
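For example (a quick check with made-up column counts of 85 and 88):

arrays = [np.ones((20, 85)), np.ones((20, 88))]
array_sum = np.zeros((20, 90))
for array in arrays:
    array_sum[:, :array.shape[1]] += array
print(array_sum[0, 84], array_sum[0, 86], array_sum[0, 89])  # 2.0 1.0 0.0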
Upvotes: 1
Reputation: 1275
If you know the max number of columns, then just accept a bit of memory overhead (it's not that much in your small-matrix scenario) and initialize all your matrices to (20 x the max, 90?). numpy/scipy work best (i.e. fastest and most consistently) when you don't dynamically resize matrices.
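For instance (a sketch; the 20 x 90 maximum is from the question, and num_matrices is a hypothetical count):

import numpy as np

# Allocate every matrix at the maximum size up front; unused columns stay zero.
matrices = [np.zeros((20, 90)) for _ in range(num_matrices)]
# ...fill each one's real data into its left-most columns...
total = sum(matrices)  # every shape matches, so plain addition works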
Alternatively, more in line with your original question (but much less efficient), you could pad your smaller matrices up to the size of the largest matrix (zero-padding or whatever) as you encounter larger and larger matrices.
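A minimal sketch of that grow-as-you-go approach, assuming the shapes from the question (20 rows, at least 80 columns); pad_to and arrays are my own names, and np.pad zero-fills by default:

import numpy as np

def pad_to(a, rows, cols):
    # Zero-pad a 2-D array on the bottom and right up to (rows, cols).
    return np.pad(a, ((0, rows - a.shape[0]), (0, cols - a.shape[1])))

running = np.zeros((20, 80))  # smallest possible shape per the question
for array in arrays:
    rows = max(running.shape[0], array.shape[0])
    cols = max(running.shape[1], array.shape[1])
    # Grow the running total whenever a larger matrix comes along.
    running = pad_to(running, rows, cols) + pad_to(array, rows, cols)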
Upvotes: 1