Reputation: 2081
I have a 3D array with shape (28, 28, 60000), corresponding to 60000 28x28 images. I want to get a random 24x24 chunk of each image by using the following function:
import random

def crop(X):
    # random.randint includes both endpoints, so x and y are in 0..3
    x = random.randint(0,3)
    y = random.randint(0,3)
    return X[x:24+x, y:24+y]
If I apply the function crop(X) to my matrix X, however, the same chunk from each sample is returned. How do I ensure each sample uses different randomly generated x and y values?
Upvotes: 3
Views: 283
Reputation: 221584
Here's a generic vectorized approach (it also handles non-square images) using NumPy broadcasting and linear indexing. It generates the slices across all the images in one go and produces a 3D array output with the image axis first, like so -
import numpy as np

# Store shape
m,n,N = A.shape # A is the input array
# Set output block shape
out_blk_shape = (24,24)
# Random start column (x) and start row (y) for each image.
# randint's upper bound is exclusive, hence the +1.
x = np.random.randint(0,n-out_blk_shape[1]+1,(N))
y = np.random.randint(0,m-out_blk_shape[0]+1,(N))
# Get range arrays for the block across all images
R0 = np.arange(out_blk_shape[0])
R1 = np.arange(out_blk_shape[1])
# Get offset and thus all linear indices. Finally index into input array.
offset_idx = (y*n*N + x*N) + np.arange(N)
all_idx = R0[:,None]*n*N + R1*N + offset_idx[:,None,None]
out = A.ravel()[all_idx] # shape (N, out_blk_shape[0], out_blk_shape[1])
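To see why the offsets above are scaled by n*N and N, it may help to check the flat-index arithmetic on a tiny array. This is only an illustrative sketch: the array and the chosen position (i, j, k) are made up, and it assumes the array is C-ordered, which is what ravel() uses by default.

```python
import numpy as np

# Hypothetical small array, chosen only for illustration
m, n, N = 4, 5, 3
A = np.arange(m * n * N).reshape(m, n, N)

# For a C-ordered array of shape (m, n, N), element A[i, j, k]
# sits at flat position i*n*N + j*N + k.
i, j, k = 2, 3, 1
flat = i * n * N + j * N + k

# Both comparisons should hold for any valid (i, j, k)
same_value = A.ravel()[flat] == A[i, j, k]
same_index = flat == np.ravel_multi_index((i, j, k), A.shape)
print(same_value, same_index)
```

So moving one row down adds n*N to the flat index, moving one column right adds N, and moving to the next image adds 1.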
Sample run -
1) Inputs :
In [188]: A = np.random.randint(0,255,(6,7,2)) # Input array
In [189]: # Set output block shape
...: out_blk_shape = (3,2) # For demo reduced to a small shape
# Rest of the code stays the same.
In [190]: x # To select the start columns from the slice
Out[190]: array([1, 0])
In [191]: y # To select the start rows from the slice
Out[191]: array([1, 2])
In [192]: A[:,:,0]
Out[192]:
array([[ 75, 160, 110,  29,  77, 198,  78],
       [237,  39, 219, 184,  73, 149, 144],
       [138, 148, 243, 160, 165, 125,  17],
       [155, 157, 110, 175,  91, 216,  61],
       [101,   5, 209,  98, 212,  44,  63],
       [213, 155,  96, 160, 193, 185, 157]])
In [193]: A[:,:,1]
Out[193]:
array([[201, 223,   7, 140,  98,  41, 167],
       [139, 247, 134,  17,  74, 216,   0],
       [ 44,  28,  26, 182,  45,  24,  34],
       [178,  29, 233, 146, 157, 230, 173],
       [111, 220, 234,   6, 246, 218, 149],
       [200, 101,  23, 116, 166, 199, 233]])
2) Output :
In [194]: out
Out[194]:
array([[[ 39, 219],
        [148, 243],
        [157, 110]],

       [[ 44,  28],
        [178,  29],
        [111, 220]]])
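One way to sanity-check the vectorized result is to compare it against a plain per-image slicing loop. The sketch below wraps the linear-indexing steps in a helper; the name random_crops, its arguments, and the seeded generator are assumptions made for illustration and are not part of the code above.

```python
import numpy as np

def random_crops(A, blk_shape, rng):
    """Hypothetical helper: one random crop per image via linear indexing.

    A has shape (m, n, N); the result has shape (N, h, w).
    """
    m, n, N = A.shape
    h, w = blk_shape
    # Start row and start column of each crop (upper bound is exclusive)
    row0 = rng.integers(0, m - h + 1, N)
    col0 = rng.integers(0, n - w + 1, N)
    # Flat index of each crop's top-left element in the C-ordered array
    offset = row0 * n * N + col0 * N + np.arange(N)
    # Add the within-block row and column steps by broadcasting
    idx = (np.arange(h)[:, None] * n * N
           + np.arange(w) * N
           + offset[:, None, None])
    return A.ravel()[idx], row0, col0

rng = np.random.default_rng(0)
A = rng.integers(0, 255, (6, 7, 2))
out, row0, col0 = random_crops(A, (3, 2), rng)

# Reference: slice each image separately and stack along a new first axis
expected = np.stack([A[r:r + 3, c:c + 2, k]
                     for k, (r, c) in enumerate(zip(row0, col0))])
print(out.shape, np.array_equal(out, expected))
```

If the original (height, width, image) layout is wanted, out.transpose(1, 2, 0) moves the image axis back to the end.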
Upvotes: 0
Reputation: 2415
Here is my attempt at it.
Basically the idea is that you have to split the matrix along the last dimension first, since NumPy has no built-in way to apply a function to each 2D slice (np.apply_along_axis only works on 1D slices). You can do the split using dsplit, and put the pieces back together using dstack.
Then you apply your crop function to each component. As a simplified example:
import random
import numpy as np

a = np.array(range(300)).reshape(10,10,3)

def crop(X):
    x = random.randint(0,3)
    y = random.randint(0,3)
    return X[x:3+x, y:3+y]

# we can loop over each component of the matrix by first splitting it
# off the last dimension:
b = [np.squeeze(x) for x in np.dsplit(a, a.shape[-1])]
# this will recreate the original matrix
c = np.dstack(b)
# so putting it together with the crop function
# (crop is called once per image, so each gets its own x and y)
get_rand_matrix = [crop(np.squeeze(x)) for x in np.dsplit(a, a.shape[-1])]
desired_result = np.dstack(get_rand_matrix)
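The same split, crop and restack idea can be sketched with the crop size as a parameter. Here crop_each and its arguments are hypothetical names, and np.moveaxis is swapped in for dsplit plus squeeze as a way to iterate over the images; treat it as one possible variant rather than the approach above.

```python
import random
import numpy as np

def crop_each(a, size, seed=None):
    """Hypothetical helper: crop each (m, n) image of an (m, n, N) array
    at its own random offset and restack the crops to (h, w, N)."""
    rnd = random.Random(seed)
    m, n, _ = a.shape
    h, w = size
    crops = []
    # Moving the last axis to the front lets us iterate image by image
    for img in np.moveaxis(a, -1, 0):
        r = rnd.randint(0, m - h)  # randint includes both endpoints
        c = rnd.randint(0, n - w)
        crops.append(img[r:r + h, c:c + w])
    return np.dstack(crops)

a = np.arange(300).reshape(10, 10, 3)
result = crop_each(a, (3, 3), seed=0)
print(result.shape)
```

Because the offsets are drawn inside the loop, every image gets its own pair of values, which is the behaviour the question asks for.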
Upvotes: 1