daydreamer

Reputation: 92019

Numpy/Scipy: How to re-construct an ndarray?

I am working on a classification problem.
I have an ndarray of shape (604329, 33), with 32 feature columns and one label column:

>>> n_data.shape   
(604329, 33)

The third column of this ndarray is the label, with values 0 and 1.
I need to move this column to the last position so that the array is easier to slice.

Question:
Is there a way to reconstruct the ndarray so that the third column becomes the last column?

Upvotes: 2

Views: 167

Answers (3)

senderle

Reputation: 151027

As an alternative to aix's hstack solution, you could index the array directly with a reordered list of column indices, without hstack.

>>> a = numpy.array([range(33) for _ in range(4)])
>>> indices = list(range(33))  # list() so that pop/append also work on Python 3
>>> indices.append(indices.pop(3))
>>> a[:,indices]
array([[ 0,  1,  2,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,  3],
       [ 0,  1,  2,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,  3],
       [ 0,  1,  2,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,  3],
       [ 0,  1,  2,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,  3]])
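
The index-list approach above can be wrapped in a small helper. This is only a sketch: the name move_column_to_end and the toy 3x4 array are made up for illustration, and it assumes a 2-D array. Note that NumPy indices start at 0, so the third column is index 2, while the session above moves index 3.

```python
import numpy as np

def move_column_to_end(arr, col):
    """Return a copy of 2-D `arr` with column `col` moved to the last position."""
    order = list(range(arr.shape[1]))  # list() is required on Python 3
    order.append(order.pop(col))       # take `col` out and re-append it at the end
    return arr[:, order]               # integer-list indexing returns a new array

# Hypothetical toy data: 3 rows, 4 columns.
a = np.arange(12).reshape(3, 4)
moved = move_column_to_end(a, 2)       # index 2 is the third column
print(moved)
```

Because indexing with a list of integers copies the data, the original array is left unchanged.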

It's a bit faster for small arrays:

>>> %timeit numpy.hstack((a[:,:3], a[:,4:], a[:, 3:4]))
100000 loops, best of 3: 19.1 us per loop
>>> %timeit indices = list(range(33)); indices.append(indices.pop(3)); a[:,indices]
100000 loops, best of 3: 14 us per loop

For larger arrays, however, it is slower:

>>> a = numpy.array([range(33) for _ in range(600000)])
>>> %timeit numpy.hstack((a[:,:3], a[:,4:], a[:, 3:4]))
1 loops, best of 3: 385 ms per loop
>>> %timeit indices = list(range(33)); indices.append(indices.pop(3)); a[:,indices]
1 loops, best of 3: 670 ms per loop
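
For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard timeit module. The function names, the row count, and the repeat count are arbitrary choices for illustration, and the timings will vary with the machine and the NumPy version.

```python
import timeit
import numpy as np

# Arbitrary test size; increase the row count to approach the case in the question.
a = np.array([range(33) for _ in range(1000)])

def with_hstack():
    # Stitch the array back together from three column slices.
    return np.hstack((a[:, :3], a[:, 4:], a[:, 3:4]))

def with_index_list():
    # Build a reordered list of column indices and index with it.
    indices = list(range(33))
    indices.append(indices.pop(3))
    return a[:, indices]

# Both approaches should produce the same array.
same = np.array_equal(with_hstack(), with_index_list())

t_hstack = timeit.timeit(with_hstack, number=100)
t_index = timeit.timeit(with_index_list, number=100)
print("hstack: %.6f s, index list: %.6f s" % (t_hstack, t_index))
```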

If you don't need to preserve the order of the columns (i.e. if you can use roll), then Mr. E's roll solution is the fastest for large a:

>>> %timeit numpy.roll(a, -3, axis=1)
10 loops, best of 3: 120 ms per loop
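
To make the effect of roll on column order concrete, here is a small sketch on a toy one-row array (the array is made up for illustration). A shift of -3 along axis 1 wraps the first three columns around to the end, so the column that was at index 2 lands in the last position and the remaining columns change place as well.

```python
import numpy as np

# Toy array with a single row so the column order is easy to read.
a = np.arange(6).reshape(1, 6)       # columns: 0 1 2 3 4 5
rolled = np.roll(a, -3, axis=1)      # shift every column three places to the left
print(rolled)                        # columns: 3 4 5 0 1 2
```

This is why roll only fits when the relative order of the feature columns does not matter.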

Upvotes: 2

YXD

Reputation: 32521

If I understand correctly, you want to do:

my_array = numpy.roll(my_array, -3, axis=1)

Upvotes: 2

NPE

Reputation: 500495

The following will do it:

x = np.hstack((x[:, :3], x[:, 4:], x[:, 3:4]))

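where x is your ndarray. Note that index 3 selects the fourth column (NumPy indices start at 0); if the label is in the third column, use x[:, :2], x[:, 3:] and x[:, 2:3] instead.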
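
As a sketch of how this plays out for the classification use case, the snippet below applies the same hstack pattern to a toy array and then splits features from labels. The names label_col, features and labels and the 4x5 array are assumptions for illustration; label_col is set to 2 to match the "third column" wording of the question.

```python
import numpy as np

# Toy stand-in for the real data: 4 rows, 5 columns, label in the third column.
x = np.arange(20).reshape(4, 5)
label_col = 2  # zero-based index of the label column (assumed)

# Columns before the label, columns after it, then the label itself.
# The slice label_col:label_col + 1 keeps the label 2-D so hstack can join it.
x = np.hstack((x[:, :label_col], x[:, label_col + 1:], x[:, label_col:label_col + 1]))

features = x[:, :-1]  # everything except the last column
labels = x[:, -1]     # the last column, as a 1-D array
print(features.shape, labels)
```

With the label in the last position, the feature/label split becomes a pair of simple slices.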

Upvotes: 2
