Is there a way to slice a 2D array in NumPy into smaller 2D arrays?

Example:

[[1,2,3,4],   ->   [[1,2] [3,4]
 [5,6,7,8]]         [5,6] [7,8]]

So I basically want to cut a 2x4 array into two 2x2 arrays. I'm looking for a generic solution to be used on images.

Slice 2d array into smaller 2d arrays

Answers (12)

Reputation: 2162

Method 1 :

import numpy as np
from skimage.util import view_as_blocks

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8]])

# Define block shape
block_shape = (2, 2)

# Slice the array into blocks
blocks = view_as_blocks(arr, block_shape)

print(blocks)
'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]

'''

Method 2(concise) :

import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Block size
block_shape = (2, 2)

a_shape = np.array(a.shape)
print(a_shape)#[2 4]
#convert the arrays to lists
new_shape = (a_shape // block_shape).tolist() + list(block_shape)
print(new_shape)#[1,2,2, 2]


new_strides = (a.strides[0] * block_shape[0], a.strides[1] * block_shape[1]) + a.strides 
print(new_strides)
res = as_strided(a, shape = new_shape, strides = new_strides)
print(res)
'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]
'''

Method 2 :

import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
# Block size
block_shape = (2, 2)

shape = a.shape
strides = a.strides

newShape1 =( shape[0] // block_shape[0] ) 
newShape2 =( shape[1] // block_shape[1] ) 

newShape = (newShape1,newShape2,block_shape[0],block_shape[1])
print(newShape)#(1, 2, 2, 2)

newStrides1 = strides[0] * block_shape[0]
newStrides2 = strides[1] * block_shape[1]

newStrides = (newStrides1, newStrides2,strides[0],strides[1] )
print(newStrides) #(32, 8, 16, 4) with a 4-byte integer dtype; (64, 16, 32, 8) with int64

blocks = as_strided(a, shape = newShape, strides = newStrides)
print(blocks)
'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]

'''
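as_strided does no bounds checking, so a wrong shape or stride can read memory outside the array. As a hedged alternative (assuming NumPy 1.20 or newer, where sliding_window_view is available), one can build all overlapping windows safely and then step through them by the block size so that only the non-overlapping blocks remain. A minimal sketch:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
block_shape = (2, 2)

# All overlapping 2x2 windows: shape (1, 3, 2, 2) for this 2x4 input
windows = sliding_window_view(a, block_shape)

# Keep every block_shape-th window along each axis -> non-overlapping blocks
blocks = windows[::block_shape[0], ::block_shape[1]]

print(blocks.shape)  # (1, 2, 2, 2)
print(blocks)
```

The result is a read-only view with the same layout as the as_strided output above; call .copy() on it if the blocks need to be modified.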

Einsum :

import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Define block shape
block_shape = (2, 2)

grsize = 4
halfsize = grsize // 2

reshaped = a.reshape(-1, grsize)
aa = np.einsum('ij -> ij', reshaped[:, :halfsize])
bb = np.einsum('ij -> ij', reshaped[:, halfsize:])

# Stack aa and bb along a new axis
combined = np.stack([aa, bb], axis=0)
print(combined)
'''
[[[1 2]
  [5 6]]

 [[3 4]
  [7 8]]]
'''

# Reshape to the desired 4D shape
final_output = combined.reshape(1, 2, 2, 2)

print(final_output)
'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]

'''

tensordot :

import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Define block shape
block_shape = (2, 2)
grsize = 4
halfsize = grsize // 2
reshaped = a.reshape(-1, grsize)
# Split the reshaped array into two halves using tensordot
aa = np.tensordot(a[:, :halfsize], np.ones((1,), dtype=int), axes=0)
bb = np.tensordot(a[:, halfsize:], np.ones((1,), dtype=int), axes=0)

# Stack aa and bb along a new axis
combined = np.stack([aa, bb], axis=0)
print(combined)
# Reshape to the desired 4D shape
final_output = combined.reshape(1, 2, 2, 2)

print(final_output)
'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]

'''

Method 5 :

import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Block size
block_shape = (2, 2)

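blockVertical = a.shape[0] // block_shape[0]    # number of blocks along the rows

blockHorizontal = a.shape[1] // block_shape[1]  # number of blocks along the columns

# split each axis into (block index, position inside block), then move both block-index axes to the front
reshapedArray1 = a.reshape(blockVertical, block_shape[0], blockHorizontal, block_shape[1]).swapaxes(1,2)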
print(reshapedArray1)

'''
[[[[1 2]
   [5 6]]

  [[3 4]
   [7 8]]]]

'''

Upvotes: 0

unutbu

Reputation: 881027

There was another question a couple of months ago which clued me in to the idea of using reshape and swapaxes. The h//nrows makes sense since this keeps the first block's rows together. It also makes sense that you'll need nrows and ncols to be part of the shape. -1 tells reshape to fill in whatever number is necessary to make the reshape valid. Armed with the form of the solution, I just tried things until I found the formula that works.

You should be able to break your array into "blocks" using some combination of reshape and swapaxes:

def blockshaped(arr, nrows, ncols):
    """
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size

    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape
    assert h % nrows == 0, f"{h} rows is not evenly divisible by {nrows}"
    assert w % ncols == 0, f"{w} cols is not evenly divisible by {ncols}"
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))

turns c

np.random.seed(365)
c = np.arange(24).reshape((4, 6))
print(c)

[out]:
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]

into

print(blockshaped(c, 2, 3))

[out]:
[[[ 0  1  2]
  [ 6  7  8]]

 [[ 3  4  5]
  [ 9 10 11]]

 [[12 13 14]
  [18 19 20]]

 [[15 16 17]
  [21 22 23]]]

I've posted an inverse function, unblockshaped, here, and an N-dimensional generalization here. The generalization gives a little more insight into the reasoning behind this algorithm.

Note that there is also superbatfish's blockwise_view. It arranges the blocks in a different format (using more axes) but it has the advantage of (1) always returning a view and (2) being capable of handling arrays of any dimension.
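Because blockshaped only reshapes and swaps axes, it can be undone by applying the same steps in reverse. The sketch below is an illustration of that idea, not the linked unblockshaped code; the name unblockshaped_sketch is hypothetical, and it assumes the blocks came from an array whose height and width are exact multiples of the block size.

```python
import numpy as np

def blockshaped(arr, nrows, ncols):
    """Split a 2D array into (n, nrows, ncols) blocks (as defined above)."""
    h, w = arr.shape
    return (arr.reshape(h // nrows, nrows, -1, ncols)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols))

def unblockshaped_sketch(blocks, h, w):
    """Hypothetical inverse: reassemble (n, nrows, ncols) blocks into (h, w)."""
    n, nrows, ncols = blocks.shape
    return (blocks.reshape(h // nrows, -1, nrows, ncols)  # (block row, block col, r, c)
                  .swapaxes(1, 2)                          # (block row, r, block col, c)
                  .reshape(h, w))

c = np.arange(24).reshape(4, 6)
restored = unblockshaped_sketch(blockshaped(c, 2, 3), 4, 6)
print(np.array_equal(restored, c))  # True
```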

Upvotes: 110

Yev Guyduy

Reputation: 1569

To add to @Aenaon's answer and his blockfy function: if you are working with color images (3D arrays), here is my pipeline to create 224 x 224 crops from a 3-channel input.

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size

    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of block rows
    bpc = ((n-1)//q + 1) #number of block columns
    M = p * bpr
    N = q * bpc

    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a

    block_list = []
    previous_row = 0
    for row_block in range(bpr):
        previous_row = row_block * p
        previous_column = 0
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]

            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]

            ## append
            if block.size:
                block_list.append(block)

    return block_list

I then extended the above to:

import os
import numpy as np
from skimage import io   # assumed source of io.imread (as_gray is a scikit-image argument)
from PIL import Image    # assumed source of Image.fromarray

# path_to_crop and path_save_crop are your input and output folders (each ending with a path separator)
for file in os.listdir(path_to_crop):   ### list files in your folder
   img = io.imread(path_to_crop + file, as_gray=False) ### open image

   r = blockfy(img[:,:,0],224,224)  ### crop blocks of 224 x 224 for red channel
   g = blockfy(img[:,:,1],224,224)  ### crop blocks of 224 x 224 for green channel
   b = blockfy(img[:,:,2],224,224)  ### crop blocks of 224 x 224 for blue channel

   for x in range(0,len(r)):
       img = np.array((r[x],g[x],b[x])) ### combine the channels patch by patch; shape is (3, rows, cols)

       img = img.astype(np.uint8) ### cast back to proper integers (blockfy returns floats)

       img_swap = img.swapaxes(0, 2) ### swap axes to move the channel axis last: (cols, rows, 3)

       img_swap_2 = img_swap.swapaxes(0, 1) ### swap again to get (rows, cols, 3)

       Image.fromarray(img_swap_2).save(path_save_crop+str(x)+"bounding" + file,
                                        format = 'jpeg',
                                        subsampling=0,
                                        quality=100) ### save patch with new name etc
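When the image height and width are exact multiples of the crop size, the per-channel split is not strictly needed: the reshape/swapaxes trick from the other answers also works with the channel axis left in place. The following is a minimal sketch under that divisibility assumption; blockshaped_rgb is a hypothetical helper name and the image is synthetic.

```python
import numpy as np

def blockshaped_rgb(img, p, q):
    """Hypothetical helper: split an (H, W, C) image into (n, p, q, C) blocks.

    Assumes H is divisible by p and W is divisible by q.
    """
    h, w, c = img.shape
    if h % p or w % q:
        raise ValueError("image size must be divisible by the block size")
    return (img.reshape(h // p, p, w // q, q, c)
               .swapaxes(1, 2)
               .reshape(-1, p, q, c))

# Synthetic stand-in for an RGB image: 448 x 672 pixels, 3 channels
img = np.arange(448 * 672 * 3, dtype=np.uint32).reshape(448, 672, 3)
patches = blockshaped_rgb(img, 224, 224)
print(patches.shape)  # (6, 224, 224, 3)
```

Patches come out in row-major block order, so patches[1] is the crop to the right of patches[0]. Images whose size is not a multiple of the crop size still need padding or cropping first, which is what blockfy handles.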

Upvotes: 0

serwus

Reputation: 165

Here is my solution. Note that this code doesn't actually create copies of the original array, so it works well with big data. Moreover, it doesn't crash if the array cannot be divided evenly (but you can easily add a check for that by removing ceil and verifying that the array dimensions divide into v_slices and h_slices without remainder).

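import numpy as np
from math import ceil

a = np.arange(9).reshape(3, 3)

p, q = 2, 2                  # block height and block width
height, width = a.shape      # shape is (rows, columns)

v_slices = ceil(height / p)  # number of blocks along the vertical axis
h_slices = ceil(width / q)   # number of blocks along the horizontal axis

for v in range(v_slices):
    for h in range(h_slices):
        block = a[v * p : v * p + p, h * q : h * q + q]
        # do something with a block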

This code changes (or, more precisely, gives you direct access to part of an array) this:

[[0 1 2]
 [3 4 5]
 [6 7 8]]

Into this:

[[0 1]
 [3 4]]
[[2]
 [5]]
[[6 7]]
[[8]]

If you need actual copies, Aenaon's code is what you are looking for.

If you are sure that the big array can be divided evenly, you can use NumPy's splitting tools.
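To make the "no copies" point concrete, here is a small sketch showing that a block obtained by basic slicing shares memory with the original array, so writing to the block also changes the array. The values are purely illustrative.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

block = a[0:2, 0:2]                  # basic slicing returns a view
print(np.shares_memory(a, block))    # True

block[0, 0] = 100                    # writing to the view changes a
print(a[0, 0])                       # 100

independent = a[0:2, 0:2].copy()     # an explicit copy is detached
independent[0, 0] = -1
print(a[0, 0])                       # 100
```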

Upvotes: 0

Dawn

Reputation: 3638

import numpy as np

a = np.random.randint(1, 9, size=(9,9))
out = [np.hsplit(x, 3) for x in np.vsplit(a,3)]
print(a)
print(out)

yields

[[7 6 2 4 4 2 5 2 3]
 [2 3 7 6 8 8 2 6 2]
 [4 1 3 1 3 8 1 3 7]
 [6 1 1 5 7 2 1 5 8]
 [8 8 7 6 6 1 8 8 4]
 [6 1 8 2 1 4 5 1 8]
 [7 3 4 2 5 6 1 2 7]
 [4 6 7 5 8 2 8 2 8]
 [6 6 5 5 6 1 2 6 4]]
[[array([[7, 6, 2],
       [2, 3, 7],
       [4, 1, 3]]), array([[4, 4, 2],
       [6, 8, 8],
       [1, 3, 8]]), array([[5, 2, 3],
       [2, 6, 2],
       [1, 3, 7]])], [array([[6, 1, 1],
       [8, 8, 7],
       [6, 1, 8]]), array([[5, 7, 2],
       [6, 6, 1],
       [2, 1, 4]]), array([[1, 5, 8],
       [8, 8, 4],
       [5, 1, 8]])], [array([[7, 3, 4],
       [4, 6, 7],
       [6, 6, 5]]), array([[2, 5, 6],
       [5, 8, 2],
       [5, 6, 1]]), array([[1, 2, 7],
       [8, 2, 8],
       [2, 6, 4]])]]
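The nested list produced this way is indexed as out[block_row][block_column], and it has exactly the layout that np.block expects, so the original array can be rebuilt from it. A short sketch, using a deterministic array instead of random values so the result is easy to check:

```python
import numpy as np

a = np.arange(81).reshape(9, 9)
out = [np.hsplit(x, 3) for x in np.vsplit(a, 3)]

# out[i][j] is the 3x3 block in block-row i and block-column j
print(out[1][2])

# np.block joins the inner lists horizontally and the outer list vertically
restored = np.block(out)
print(np.array_equal(restored, a))  # True
```

Note that hsplit and vsplit raise an error when the axis length is not divisible by the number of sections.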

Upvotes: 2

Aenaon

Reputation: 3593

A minor enhancement to TheMeaningfulEngineer's answer that handles the case when the big 2D array cannot be perfectly sliced into equally sized subarrays:

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size

    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of block rows
    bpc = ((n-1)//q + 1) #number of block columns
    M = p * bpr
    N = q * bpc

    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a

    block_list = []
    previous_row = 0
    for row_block in range(bpr):
        previous_row = row_block * p
        previous_column = 0
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]

            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]

            ## append
            if block.size:
                block_list.append(block)

    return block_list

Examples:

a = np.arange(25)
a = a.reshape((5,5))
out = blockfy(a, 2, 3)

a->
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

out[0] ->
array([[0., 1., 2.],
       [5., 6., 7.]])

out[1]->
array([[3., 4.],
       [8., 9.]])

out[-1]->
array([[23., 24.]])

Upvotes: 3

snoob dogg

Reputation: 2885

Here is a solution based on unutbu's answer that handles the case where the matrix cannot be divided equally. In that case, it first resizes the matrix using interpolation. You need OpenCV for this. Note that cv2.resize takes its target size as (width, height), the reverse of NumPy's (rows, columns) shape order.

import numpy as np
import cv2

def blockshaped(arr, r_nbrs, c_nbrs, interp=cv2.INTER_LINEAR):
    """
    arr      a 2D array, typically an image
    r_nbrs   number of block rows
    c_nbrs   number of block columns
    """

    arr_h, arr_w = arr.shape

    # largest size that divides evenly into the requested number of blocks
    size_w = (arr_w // c_nbrs) * c_nbrs
    size_h = (arr_h // r_nbrs) * r_nbrs

    if size_w != arr_w or size_h != arr_h:
        # cv2.resize expects dsize as (width, height)
        arr = cv2.resize(arr, (size_w, size_h), interpolation=interp)

    block_h = size_h // r_nbrs   # rows per block
    block_w = size_w // c_nbrs   # columns per block

    return (arr.reshape(r_nbrs, block_h, c_nbrs, block_w)
               .swapaxes(1,2)
               .reshape(-1, block_h, block_w))

Upvotes: 1

warmspringwinds

Reputation: 1177

If you want a solution that also handles the cases when the matrix is not equally divided, you can use this:

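import numpy as np
from functools import reduce   # reduce is no longer a builtin in Python 3
from operator import add

# arr is the 2D array to split (the name `input` would shadow the Python builtin)
half_split = np.array_split(arr, 2)

res = map(lambda x: np.array_split(x, 2, axis=1), half_split)
res = reduce(add, res)   # concatenate the per-half lists into one flat list of blocks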
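Note that np.array_split takes the number of pieces, not the block size, and it tolerates uneven division by making the leading pieces one element larger. A minimal sketch of the same two-step split on a 5x5 array (the array and the name quadrants are illustrative only):

```python
import numpy as np

arr = np.arange(25).reshape(5, 5)

# Split into 2 row groups, then each group into 2 column groups
quadrants = [block
             for rows in np.array_split(arr, 2, axis=0)
             for block in np.array_split(rows, 2, axis=1)]

print([q.shape for q in quadrants])
# [(3, 3), (3, 2), (2, 3), (2, 2)]
```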

Upvotes: 2

Saullo G. P. Castro

Reputation: 59015

Your question is practically the same as this one. You can use the one-liner with np.ndindex() and reshape():

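def cutter(a, r, c):
    lenr = a.shape[0] // r   # number of block rows (integer division, required by np.ndindex)
    lenc = a.shape[1] // c   # number of block columns
    return np.array([a[i*r:(i+1)*r,j*c:(j+1)*c] for (i,j) in np.ndindex(lenr,lenc)]).reshape(lenr,lenc,r,c)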

To create the result you want:

a = np.arange(1,9).reshape(2,4)
#array([[1, 2, 3, 4],
#       [5, 6, 7, 8]])

cutter( a, 1, 2 )
#array([[[[1, 2]],
#        [[3, 4]]],
#       [[[5, 6]],
#        [[7, 8]]]])

Upvotes: 3

JAB

Reputation: 21089

There are some other answers that seem well-suited for your specific case already, but your question piqued my interest in the possibility of a memory-efficient solution usable up to the maximum number of dimensions that numpy supports, and I ended up spending most of the afternoon coming up with a possible method. (The method itself is relatively simple; it's just that I still haven't used most of the really fancy features that numpy supports, so most of the time was spent researching to see what numpy had available and how much it could do so that I didn't have to do it.)

def blockgen(array, bpa):
    """Creates a generator that yields multidimensional blocks from the given
array(_like); bpa is an array_like consisting of the number of blocks per axis
(minimum of 1, must be a divisor of the corresponding axis size of array). As
the blocks are selected using normal numpy slicing, they will be views rather
than copies; this is good for very large multidimensional arrays that are being
blocked, and for very large blocks, but it also means that the result must be
copied if it is to be modified (unless modifying the original data as well is
intended)."""
    bpa = np.asarray(bpa) # in case bpa wasn't already an ndarray

    # parameter checking
    if array.ndim != bpa.size:         # bpa doesn't match array dimensionality
        raise ValueError("Size of bpa must be equal to the array dimensionality.")
    if (not np.issubdtype(bpa.dtype, np.integer)  # bpa must be all integers (np.int was removed from NumPy)
        or (bpa < 1).any()             # all values in bpa must be >= 1
        or (array.shape % bpa).any()): # % != 0 means not evenly divisible
        raise ValueError("bpa ({0}) must consist of nonzero positive integers "
                         "that evenly divide the corresponding array axis "
                         "size".format(bpa))


    # generate block edge indices
    rgen = (np.r_[:array.shape[i]+1:array.shape[i]//blk_n]
            for i, blk_n in enumerate(bpa))

    # build slice sequences for each axis (unfortunately broadcasting
    # can't be used to make the items easy to operate over)
    c = [[np.s_[i:j] for i, j in zip(r[:-1], r[1:])] for r in rgen]

    # Now to get the blocks; this is slightly less efficient than it could be
    # because numpy doesn't like jagged arrays and I didn't feel like writing
    # a ufunc for it.
    for idxs in np.ndindex(*bpa):
        blockbounds = tuple(c[j][idxs[j]] for j in range(bpa.size))

        yield array[blockbounds]
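For comparison, a similar N-dimensional generator can be written with itertools.product. This is a different, simplified technique rather than a rewrite of blockgen: the hypothetical iter_blocks below takes the block size per axis instead of the number of blocks per axis, and it lets edge blocks come out smaller when the sizes do not divide evenly.

```python
import itertools
import numpy as np

def iter_blocks(array, block_shape):
    """Hypothetical helper: yield views of `array` tiled by `block_shape`."""
    if array.ndim != len(block_shape):
        raise ValueError("block_shape needs one entry per array axis")
    # Start offsets of the blocks along each axis
    starts = [range(0, size, step)
              for size, step in zip(array.shape, block_shape)]
    for corner in itertools.product(*starts):
        # Slices past the end of an axis are clipped automatically
        index = tuple(slice(s, s + step)
                      for s, step in zip(corner, block_shape))
        yield array[index]

cube = np.arange(24).reshape(2, 3, 4)
shapes = [b.shape for b in iter_blocks(cube, (1, 2, 2))]
print(len(shapes))  # 8
print(shapes[0])    # (1, 2, 2)
print(shapes[2])    # (1, 1, 2)
```

As with blockgen, the yielded blocks are views, so they must be copied before being modified unless changing the original data is intended.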

Upvotes: 7

Francesco Montesano

Reputation: 8668

It seems to me that this is a task for numpy.split or some variant.

e.g.

a = np.arange(30).reshape([5,6])  #a.shape = (5,6)
a1 = np.split(a,3,axis=1) 
#'a1' is a list of 3 arrays of shape (5,2)
a2 = np.split(a, [2,4])
#'a2' is a list of three arrays of shape (2,6), (2,6), (1,6)

If you have an NxN image you can create, e.g., a list of two Nx(N/2) subimages, and then divide them along the other axis.

numpy.hsplit and numpy.vsplit are also available.

Upvotes: 8

TheMeaningfulEngineer

Reputation: 16359

For now it just works when the big 2d array can be perfectly sliced into equally sized subarrays.

The code below slices

a ->array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17],
           [18, 19, 20, 21, 22, 23]])

into this

block_array->
    array([[[ 0,  1,  2],
            [ 6,  7,  8]],

           [[ 3,  4,  5],
            [ 9, 10, 11]],

           [[12, 13, 14],
            [18, 19, 20]],

           [[15, 16, 17],
            [21, 22, 23]]])

p and q determine the block size

Code

import numpy as np

a = np.arange(24)
a = a.reshape((4,6))
m = a.shape[0]  #image row size
n = a.shape[1]  #image column size

p = 2     #block row size
q = 3     #block column size

blocks_per_row = m // p      #number of blocks along the rows
blocks_per_column = n // q   #number of blocks along the columns

block_array = []
previous_row = 0
for row_block in range(blocks_per_row):
    previous_row = row_block * p
    previous_column = 0
    for column_block in range(blocks_per_column):
        previous_column = column_block * q
        block = a[previous_row:previous_row+p,previous_column:previous_column+q]
        block_array.append(block)

block_array = np.array(block_array)

Upvotes: 2
