lipponen

Reputation: 753

Proper way to create big Numpy arrays in loop?

I have a function (getArray) that returns a NumPy array of size (1, 40). Let's say it returns:

[-0.385 -0.385 -0.405 -0.455 -0.485 -0.485 -0.425 -0.33  -0.22  -0.07   0.12   0.375  0.62   0.78   0.84   0.765  0.52   0.17  -0.165 -0.365 -0.435 -0.425 -0.37  -0.33  -0.325 -0.335 -0.345 -0.33  -0.325 -0.315 -0.31  -0.32 -0.335 -0.34  -0.325 -0.345 -0.335 -0.33  -0.335 -0.33 ]

Then, in a loop, I need to build a NumPy array containing multiple arrays returned by getArray. The final array can have a size of, for example, (2000, 40). What is the proper way of doing this?

If I create the NumPy array inside the loop, I have to create a new array in every iteration, which is not what I want. For now I first build a list of NumPy arrays and then create an array from the list. This works nicely until the number of rows exceeds 255. After that, the array changes from 2D to 1D.

I have also tried adding rows to the array with the vstack function. As long as the final array is no larger than (255, 40), this works nicely. Here is the code I used:

import numpy

A = numpy.empty((0,40), float)
for value in values:
    meas = getArray(value)
    A = numpy.vstack([A, meas])
print(A.shape)
print(A)

If there are at most 255 rows, I get the following result:

(255, 40)
[[-0.385 -0.385 -0.405 ..., -0.33  -0.335 -0.33 ]
 [-0.425 -0.445 -0.475 ..., -0.375 -0.395 -0.41 ]
 [-0.41  -0.435 -0.465 ..., -0.4   -0.4   -0.415]
 ...,
 [-0.47  -0.495 -0.495 ..., -0.425 -0.425 -0.43 ]
 [-0.5   -0.52  -0.57  ..., -0.455 -0.445 -0.435]
 [-0.515 -0.57  -0.62  ..., -0.39  -0.41  -0.385]]

When there are more than 255 rows, I get the following error:

ValueError: all the input array dimensions except for the concatenation axis must match exactly
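For readers hitting the same message: vstack raises this error when one of the stacked arrays has a different number of columns than the others. The sketch below reproduces it under the assumption (a guess about the data, not something shown in the question) that one call to getArray returns 39 values instead of 40. The arrays `good` and `short` are hypothetical stand-ins.

```python
import numpy as np

good = np.zeros((1, 40))   # stand-in for a normal getArray result
short = np.zeros((1, 39))  # hypothetical result with one value missing

A = np.empty((0, 40), float)
A = np.vstack([A, good])   # fine: column counts match

try:
    A = np.vstack([A, short])  # column counts differ, so this raises
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(A.shape)
print(error_message)
```

If this is what happens, the row count of 255 is probably incidental and the 256th measurement is simply the first one with a different length.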

Edit:

The following works:

array = numpy.empty((size,total_window_size))
for index, value in enumerate(values):
    meas = getArray(value)
    # Rows skipped by this check keep whatever numpy.empty left in them.
    if meas.size == total_window_size:
        array[index] = meas
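One caveat with this approach is that skipped rows stay uninitialized. A possible variation, sketched here with a hypothetical `get_array` stub and a made-up `values` list, records which rows were written in a boolean mask and drops the rest afterwards:

```python
import numpy as np

def get_array(value):
    # Hypothetical stand-in for getArray: returns a short row for value == 2.
    n = 39 if value == 2 else 40
    return np.full((1, n), float(value))

values = [0, 1, 2, 3]
total_window_size = 40

array = np.empty((len(values), total_window_size))
filled = np.zeros(len(values), dtype=bool)  # which rows were actually written

for index, value in enumerate(values):
    meas = get_array(value)
    if meas.size == total_window_size:
        array[index] = meas
        filled[index] = True

array = array[filled]  # drop rows that were never assigned
print(array.shape)     # (3, 40)
```

Whether dropping the bad rows is acceptable depends on the data. Filling them with NaN would be an alternative if the row positions matter.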

Upvotes: 1

Views: 925

Answers (1)

zimmerrol

Reputation: 4951

If you know the number of iterations in advance (for example in a for loop), you can initialize the array before the loop with the appropriate size, like this:

import numpy as np

result = np.empty((nbIterations, 40))
for i in range(nbIterations):
    result[i] = getArray(parameters)
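As a self-contained sketch of the same idea, with a hypothetical `get_array` stub standing in for getArray and 2000 iterations taken from the question's example size:

```python
import numpy as np

def get_array(i):
    # Hypothetical stand-in for getArray: one row of 40 values.
    return np.full((1, 40), float(i))

nb_iterations = 2000
result = np.empty((nb_iterations, 40))  # allocated once, before the loop
for i in range(nb_iterations):
    # A (1, 40) array can be assigned directly to a row of length 40.
    result[i] = get_array(i)

print(result.shape)  # (2000, 40)
```

Every row is overwritten in the loop, so the uninitialized contents of np.empty never show up in the final array.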

Does getArray always return the same values? If so, you could also use

result = np.tile(getArray(), nbIterations).reshape((nbIterations, -1))

to create your array.
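A small sketch of what the tile call produces, using a made-up `row` in place of the getArray result. It also shows that passing a tuple of repetitions to np.tile gives the same array without the reshape:

```python
import numpy as np

# Stand-in for a single getArray result of shape (1, 40).
row = np.arange(40, dtype=float).reshape(1, 40)
nb_iterations = 3

# Repeat along the columns, then fold back into one row per iteration.
tiled = np.tile(row, nb_iterations).reshape((nb_iterations, -1))

# Equivalent: repeat nb_iterations times along the rows, once along the columns.
direct = np.tile(row, (nb_iterations, 1))

print(tiled.shape)                    # (3, 40)
print(np.array_equal(tiled, direct))  # True
```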

In general, I would not collect the rows in a list for this kind of thing: appending to a list occasionally forces the list to be reallocated in RAM, and converting the finished list to an array copies all of the data once more. Preallocating a NumPy array avoids both.

Upvotes: 1
