farhawa

Reputation: 10417

Change the format of a NumPy array with no loops

I have a NumPy array a with a.shape = (1, k*d), and I want to transform it into a NumPy array b with b.shape = (k*d, k), where each column holds one group of d consecutive elements:

b[i,j] = a[i] if j == i // d

b[i,j] = 0 otherwise

For example:

k = 3
d = 2
**********

A =  |a|   =>  B =  |a 0 0|
     |b|            |b 0 0|
     |c|            |0 c 0|
     |d|            |0 d 0|
     |e|            |0 0 e|
     |f|            |0 0 f|

Most importantly, with no loops!

What I am looking for is a sequence of NumPy matrix operations that leads to the desired result.

Upvotes: 2

Views: 194

Answers (2)

Divakar

Reputation: 221714

Here's an efficient approach based on padding the input array with zeros. The inline comments at each step should make it clearer how it produces the desired output. Here's the code -

# Assumes `import numpy as np`, with A, k and d defined as in the question.
# Arrange groups of d number of elements from the input array into 
# rows of a 2D array and pad with k*d zeros in each row. 
# Thus, the shape of this 2D array would be (k,d+k*d)
A_zeroappend = np.zeros((k,(k+1)*d))
A_zeroappend[:,:d] = A.reshape(-1,d)

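# Flatten and drop the trailing k*d padding zeros of the last row, so that
# re-cutting into rows of length k*d shifts each group right by d.
# Reshape and transpose to the desired output shape (k*d,k)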
out = A_zeroappend.ravel()[:k*k*d].reshape(-1,k*d).T
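
As a quick sanity check, the steps above can be wrapped into a self-contained function and run on the k = 3, d = 2 example from the question. This is only a sketch: the helper name `block_columns_pad` and the sample values are made up for illustration and are not part of the original answer.

```python
import numpy as np

def block_columns_pad(a, k, d):
    """Zero-padding approach (hypothetical helper name, for illustration)."""
    a = np.asarray(a)
    # One row per group of d elements, each followed by k*d zeros.
    padded = np.zeros((k, (k + 1) * d), dtype=a.dtype)
    padded[:, :d] = a.reshape(k, d)
    # Dropping the last k*d zeros makes each group start d positions later
    # once the flat data is re-cut into rows of length k*d.
    return padded.ravel()[:k * k * d].reshape(k, k * d).T

k, d = 3, 2
a = np.arange(1, k * d + 1).reshape(1, k * d)  # sample input, shape (1, k*d)
out = block_columns_pad(a, k, d)
print(out)
```

The returned array is a transposed view of the padded buffer rather than a fresh C-contiguous array, so calling `.copy()` on it may be worthwhile if later code depends on memory layout.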

Runtime test

Here's a quick runtime test comparing the proposed approach and the np.repeat based approach listed in the other answer -

In [292]: k = 800
     ...: d = 800
     ...: A = np.random.randint(2,9,(1,k*d))
     ...: 

In [293]: %%timeit
     ...: B = np.zeros((k*d,k))
     ...: B[np.arange(k*d),np.arange(k).repeat(d)]=A
     ...: 
1 loops, best of 3: 342 ms per loop

In [294]: %%timeit
     ...: A_zeroappend = np.zeros((k,(k+1)*d))
     ...: A_zeroappend[:,:d] = A.reshape(-1,d)
     ...: out = A_zeroappend.ravel()[:k*k*d].reshape(-1,k*d).T
     ...: 
100 loops, best of 3: 3.07 ms per loop

In this test the proposed approach is roughly 100x faster (3.07 ms versus 342 ms per loop).
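
To reproduce a comparison like this outside IPython, something along the following lines could be used. It is a rough sketch under a few assumptions: the sizes are deliberately small (at k = d = 800 the output alone takes about 4 GB as float64), the function names `by_index` and `by_padding` are hypothetical, and `A` is flattened explicitly before the indexed assignment. Absolute timings will differ by machine and NumPy version.

```python
import timeit
import numpy as np

k, d = 50, 40  # small sizes chosen for illustration only
A = np.random.randint(2, 9, (1, k * d))

def by_index():
    # Index-assignment approach from the other answer.
    B = np.zeros((k * d, k))
    B[np.arange(k * d), np.arange(k).repeat(d)] = A.ravel()
    return B

def by_padding():
    # Zero-padding approach from this answer.
    A_zeroappend = np.zeros((k, (k + 1) * d))
    A_zeroappend[:, :d] = A.reshape(-1, d)
    return A_zeroappend.ravel()[:k * k * d].reshape(-1, k * d).T

# Check that both approaches agree before timing them.
same = np.array_equal(by_index(), by_padding())
t_index = timeit.timeit(by_index, number=20)
t_padding = timeit.timeit(by_padding, number=20)
print(same, t_index, t_padding)
```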

Upvotes: 1

hpaulj

Reputation: 231665

This reproduces your example. It can be generalized to other values of k and d.

In [12]: a=np.arange(6)    
In [13]: b=np.zeros((6,3))
In [14]: b[np.arange(6),np.arange(3).repeat(2)]=a

In [15]: b
Out[15]: 
array([[ 0.,  0.,  0.],
       [ 1.,  0.,  0.],
       [ 0.,  2.,  0.],
       [ 0.,  3.,  0.],
       [ 0.,  0.,  4.],
       [ 0.,  0.,  5.]])

The key is the column index array, which repeats each column number d times (here, 2):

In [16]: np.arange(3).repeat(2)
Out[16]: array([0, 0, 1, 1, 2, 2])
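
One possible way to generalize the session above to arbitrary k and d is sketched below. The helper name `block_columns_index` is hypothetical, and unlike the session (which produces floats from `np.zeros`), this version keeps the input dtype; that is a choice made here, not something the answer specifies.

```python
import numpy as np

def block_columns_index(a, k, d):
    """Index-assignment approach for general k and d (hypothetical helper)."""
    a = np.asarray(a).ravel()        # accept shape (k*d,) or (1, k*d)
    b = np.zeros((k * d, k), dtype=a.dtype)
    rows = np.arange(k * d)          # every row receives exactly one value
    cols = np.arange(k).repeat(d)    # row i goes to column i // d
    b[rows, cols] = a
    return b

b = block_columns_index(np.arange(6), k=3, d=2)
print(b)
```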

Upvotes: 4
