Leeren

Reputation: 2081

How do you apply a function incorporating random numbers to rows of a numpy array in python?

So I have a 3D array with shape (28, 28, 60000), corresponding to 60000 28x28 images. I want to get random 24x24 chunks of each image by using the following function:

def crop(X):
    x = random.randint(0,3)
    y = random.randint(0,3)
    return X[x:24+x, y:24+y,]

If I apply the function crop(X) to my matrix X, however, the same chunk from each sample is returned. How do I ensure each sample uses different randomly generated x and y values?

Upvotes: 3

Views: 283

Answers (2)

Divakar

Reputation: 221584

Here's a vectorized, generic approach (it also handles non-square images) that uses NumPy broadcasting and linear indexing to take the slices from all images in one go. The output is a 3D array of shape (N, 24, 24), with one crop per image:

import numpy as np

# Store shape
m,n,N = A.shape # A is the input array

# Set output block shape
out_blk_shape = (24,24)

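# One start offset per image: x is the start column, y is the start row.
# randint excludes the upper bound, hence the +1.
x = np.random.randint(0,n-out_blk_shape[1]+1,(N))
y = np.random.randint(0,m-out_blk_shape[0]+1,(N))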

# Get range arrays for the block across all images
R0 = np.arange(out_blk_shape[0])
R1 = np.arange(out_blk_shape[1])

# Get offset and thus all linear indices. Finally index into input array.
offset_idx = (y*n*N + x*N) + np.arange(N)
all_idx = R0[:,None]*n*N + R1*N + offset_idx[:,None,None]
out = A.ravel()[all_idx]
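
The same per-image slicing can also be written with plain integer-array indexing instead of hand-computed linear indices, which may be easier to check by eye. This is only a minimal sketch, assuming the input has shape (m, n, N) with the images stacked along the last axis; the function name `random_crops` and its `rng` parameter are made up for illustration and are not part of the approach above.

```python
import numpy as np

def random_crops(A, blk_shape, rng=None):
    """Return (crops, row0, col0); crops has shape (N, h, w).

    A is assumed to have shape (m, n, N): N images of size m x n.
    """
    rng = np.random.default_rng() if rng is None else rng
    m, n, N = A.shape
    h, w = blk_shape
    # One start row and start column per image (upper bound is exclusive).
    row0 = rng.integers(0, m - h + 1, size=N)
    col0 = rng.integers(0, n - w + 1, size=N)
    # Index arrays of shape (N, h, 1), (N, 1, w) and (N, 1, 1)
    # broadcast together to (N, h, w).
    rows = row0[:, None, None] + np.arange(h)[None, :, None]
    cols = col0[:, None, None] + np.arange(w)[None, None, :]
    imgs = np.arange(N)[:, None, None]
    return A[rows, cols, imgs], row0, col0

A = np.arange(6 * 7 * 5).reshape(6, 7, 5)
crops, row0, col0 = random_crops(A, (3, 2), np.random.default_rng(0))
print(crops.shape)
```

The start offsets are returned alongside the crops so that each crop can be compared against the ordinary slice `A[r:r+h, c:c+w, k]` for the same image.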

Sample run -

1) Inputs :

In [188]: A = np.random.randint(0,255,(6,7,2)) # Input array

In [189]: # Set output block shape
     ...: out_blk_shape = (3,2) # For demo reduced to a small shape
          # Rest of the code stays the same.

In [190]: x  # To select the start columns from the slice
Out[190]: array([1, 0])

In [191]: y  # To select the start rows from the slice
Out[191]: array([1, 2])

In [192]: A[:,:,0]
Out[192]: 
array([[ 75, 160, 110,  29,  77, 198,  78],
       [237,  39, 219, 184,  73, 149, 144],
       [138, 148, 243, 160, 165, 125,  17],
       [155, 157, 110, 175,  91, 216,  61],
       [101,   5, 209,  98, 212,  44,  63],
       [213, 155,  96, 160, 193, 185, 157]])

In [193]: A[:,:,1]
Out[193]: 
array([[201, 223,   7, 140,  98,  41, 167],
       [139, 247, 134,  17,  74, 216,   0],
       [ 44,  28,  26, 182,  45,  24,  34],
       [178,  29, 233, 146, 157, 230, 173],
       [111, 220, 234,   6, 246, 218, 149],
       [200, 101,  23, 116, 166, 199, 233]])

2) Output :

In [194]: out
Out[194]: 
array([[[ 39, 219],
        [148, 243],
        [157, 110]],

       [[ 44,  28],
        [178,  29],
        [111, 220]]])

Upvotes: 0

chappers

Reputation: 2415

Here is my attempt at it.

The idea is to split the array along its last dimension into separate 2D images, because np.apply_along_axis only passes 1-D slices to the function it applies. You can do the split with dsplit and put the pieces back together with dstack.

Then you would apply your crop function over each component. As a simplified example:

import random
import numpy as np

a = np.array(range(300)).reshape(10,10,3)

def crop(X):
    x = random.randint(0,3)
    y = random.randint(0,3)
    return X[x:3+x, y:3+y]

# we can loop over each component of the matrix by first splitting it
# off the last dimension:
b = [np.squeeze(x) for x in np.dsplit(a, a.shape[-1])]

# this will recreate the original matrix
c = np.dstack(b)

# so putting it together with the crop function
get_rand_matrix = [crop(np.squeeze(x)) for x in np.dsplit(a, a.shape[-1])]
desired_result = np.dstack(get_rand_matrix)
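
The split, crop and stack steps above can also be written as a plain loop over the last axis, without dsplit and squeeze. This is a rough sketch rather than a tuned implementation; the helper name `crop_each` and the `size` parameter are assumptions added for illustration.

```python
import random
import numpy as np

def crop(X, size=3):
    # random.randint includes both endpoints, so the largest start
    # offset still leaves room for a full size-by-size block.
    x = random.randint(0, X.shape[0] - size)
    y = random.randint(0, X.shape[1] - size)
    return X[x:x + size, y:y + size]

def crop_each(a, size=3):
    # a[..., k] is the k-th 2D image; crop is called once per image,
    # so each image gets its own random offsets.
    return np.dstack([crop(a[..., k], size) for k in range(a.shape[-1])])

a = np.arange(300).reshape(10, 10, 3)
result = crop_each(a)
print(result.shape)  # (3, 3, 3)
```

For the array in the question, `crop_each(X, 24)` would give a (24, 24, 60000) result, at the cost of a Python-level loop over all 60000 images.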

Upvotes: 1
