Jey

Reputation: 600

clearing elements of numpy array

Is there a simple way to clear all elements of a numpy array? I tried:

del arrayname

This removes the array completely. I am using this array inside a for loop that iterates thousands of times, so I prefer to keep the array but populate it with new elements every time.

I tried numpy.delete, but for my requirement I don't see the use of subarray specification.

*Edited*:

The array size is not going to be the same.

I allocate the space inside the loop, at the beginning, as follows. Please correct me if this is the wrong way to go about it:

arrname = arange(x*6).reshape(x,6)

I read a dataset and construct this array for each tuple in the dataset. All I know is the number of columns is going to be the same but not the number of rows. For example, the first time I might need an array of size (3,6), for the next tuple as (1,6) and the next time as (4,6) and so on. The way I populate the array is as follows:

arrname[:,0] = lstname1
arrname[:,1] = lstname2
...

In other words, the columns are filled from lists constructed from the tuples. So, before the next loop begins I want to clear its elements and make it ready for the consecutive loop since I don't want remnants from the previous loop mixing the current contents.

Upvotes: 4

Views: 55599

Answers (4)

Phil Cooper

Reputation: 5877

With a wag of the finger for possible premature optimization, I will offer some thoughts:

You say you don't want any remnants left over from previous iterations. From your code, it looks like you populate each of the new elements column by column, for each of the known number of columns. "Left over" values don't look like a problem. Consider:

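  • Using arange and reshape serves no purpose here, since every value gets overwritten anyway. Use np.empty((n,6)) instead. It is faster than ones or zeros by a hair because it does not initialize the memory.

  • You could alternatively construct your new array from its constituents.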

See:

lstname1 = np.arange(3)
lstname2 = 22*np.arange(3)
np.vstack((lstname1,lstname2)).T
# returns
array([[ 0,  0],
       [ 1, 22],
       [ 2, 44]])
#or
np.hstack((lstname1[:,np.newaxis],lstname2[:,np.newaxis]))
array([[ 0,  0],
       [ 1, 22],
       [ 2, 44]])
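
As a possible shorthand for the two constructions above, np.column_stack builds the same array directly from 1-D sequences. This is a minimal sketch, assuming the columns arrive as plain Python lists (the values mirror lstname1 and lstname2 above):

```python
import numpy as np

lstname1 = [0, 1, 2]      # plain lists work as well as arrays
lstname2 = [0, 22, 44]

# Each 1-D input becomes one column of the result.
arr = np.column_stack((lstname1, lstname2))
print(arr.shape)   # (3, 2)
```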

Lastly, if you are really, really concerned about speed, you could allocate the largest expected size up front. If that size is not known, you could check the requested size against the largest seen so far and, if it is larger, use np.empty((rows,cols)) to allocate a bigger buffer.

Then, at each iteration, you create a view of the larger matrix with just the number of rows you want. This makes numpy reuse the same buffer space, so it does not need to do any allocation at each of your iterations. Notice:

In [36]: big = np.vstack((lstname1,lstname2)).T

In [37]: smaller = big[:2]

In [38]: smaller[:,1]=33

In [39]: smaller
Out[39]: 
array([[ 0, 33],
       [ 1, 33]])
In [40]: big
Out[40]: 
array([[ 0, 33],
       [ 1, 33],
       [ 2, 44]])

Note: these suggestions fit your expanded question with its clarification; they do not fit your earlier question about "clearing" the array. Even in the last example, you could easily call smaller.fill(0) to allay concerns, depending on whether you reliably reassign all elements of the array in each iteration.
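
To make the buffer-reuse idea concrete, here is a rough sketch of how it might look in a loop. The helper name get_view, the COLS constant and the row counts are assumptions for illustration only; none of them come from NumPy or from the question's real data.

```python
import numpy as np

COLS = 6  # the question fixes the column count at 6


def get_view(buf, rows):
    """Hypothetical helper: return (buffer, view of its first `rows` rows)."""
    if rows > buf.shape[0]:
        # Grow only when a request exceeds the largest size seen so far.
        buf = np.empty((rows, COLS))
    return buf, buf[:rows]


buf = np.empty((0, COLS))
for rows in (3, 1, 4):              # row counts taken from the question's example
    buf, arr = get_view(buf, rows)
    arr.fill(0)                     # optional: only needed if some cells are not reassigned
    arr[:, 0] = np.arange(rows)     # fill columns as in the question
```

Basic slicing such as buf[:rows] returns a view, so writes to arr land in buf and no new memory is requested unless the buffer has to grow.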

Upvotes: 3

Bi Rico

Reputation: 25833

I'm not sure what you mean by clear. The array will always have some values stored in it, but you can set those values to something, for example:

>>> A = numpy.array([[1, 2], [3, 4], [5, 6]], dtype=float)
>>> A
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.]])

>>> A.fill(0)
>>> A
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])

>>> A[:] = 1.
>>> A
array([[ 1.,  1.],
       [ 1.,  1.],
       [ 1.,  1.]])

Update

First, your question is very unclear. The more effort you put into writing a good question the better answers you'll get. A good question should make it clear to us what you're trying to do and why. Also example data is very helpful, just a small amount, so we can see exactly what you're trying to do.

That being said, it seems like maybe you should just create a new array for each iteration. Creating arrays is pretty fast, and it's not clear why you would want to reuse an array when the size and contents need to change. If you're trying to reuse it for performance reasons, you're probably not going to see any measurable difference; resizing arrays is not noticeably faster than creating a new array. You can create a new array by calling numpy.zeros((X, 6)).
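
A minimal sketch of that per-iteration approach, assuming the row count x varies as in the question. The row_counts list and the column values are made-up stand-ins for the real dataset:

```python
import numpy as np

# Hypothetical stand-in for the dataset: each record needs a different row count.
row_counts = [3, 1, 4]

shapes = []
for x in row_counts:
    arrname = np.zeros((x, 6))          # brand-new array, so no leftovers are possible
    arrname[:, 0] = np.arange(x)        # populate columns as in the question
    arrname[:, 1] = 2 * np.arange(x)
    shapes.append(arrname.shape)

print(shapes)   # [(3, 6), (1, 6), (4, 6)]
```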

Also in your question you say:

the columns are filled from lists constructed from the tuples

If your data is already housed as a list of tuples, you can use numpy.array to convert it to an array. You don't need to go to the trouble of creating an array and filling it. For example, if I wanted to get a (2, 3) array from a list of tuples I would do:

data = [(0, 0, 1), (0, 0, 2)]
A = numpy.array(data)

# or if the data is stored like this
data = [(0, 0), (0, 0), (1, 2)]
A = numpy.array(data).T

Hope that helps.

Upvotes: 5

Charles Brunet

Reputation: 23160

If you want to keep the array allocated, and with the same size, you don't need to clear the elements. Simply keep track of where you are, and overwrite the values in the array. This is the most efficient way of doing it.
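
As a small illustration, assuming the shape stays the same and every column is reassigned on each pass (the batches data is hypothetical):

```python
import numpy as np

arr = np.empty((3, 2))               # allocated once; initial contents are arbitrary

batches = [([1, 2, 3], [4, 5, 6]),   # hypothetical per-iteration column data
           ([7, 8, 9], [10, 11, 12])]

for col0, col1 in batches:
    # Assigning every column replaces all old values; no clearing step is needed.
    arr[:, 0] = col0
    arr[:, 1] = col1

print(arr.tolist())   # [[7.0, 10.0], [8.0, 11.0], [9.0, 12.0]]
```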

Upvotes: 2

Prashant Kumar

Reputation: 22649

I would simply begin putting the new values into the array.

But if you insist on clearing out the array, try making a new one of the same size using zeros or empty.

>>> A = numpy.array([[1, 2], [3, 4], [5, 6]])
>>> A
array([[1, 2],
       [3, 4],
       [5, 6]])

>>> A = numpy.zeros(A.shape)
>>> A
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])

Upvotes: 0
