ShanZhengYang

Reputation: 17631

Comparing elements in a numpy array, finding pairs, how to deal with edges/corners

I'm trying to create the function described below, but I'm not sure exactly how to implement it.

Let's say I have a 2D numpy array like

import numpy as np
arr = np.array([[ 1,  2,  3,  4], [ 1,  6,  7,  8], [ 1,  1,  1, 12], [13,  3, 15, 16]])

This is a 4x4 matrix, which looks like this when printed:

array([[ 1,  2,  3,  4],
       [ 1,  6,  7,  8],
       [ 1,  1,  1, 12],
       [13,  3, 15, 16]])

I want to access the elements of arr and compare them to each other. For each element, I would like to see whether all surrounding eight elements (top, bottom, left, right, top-left, top-right, bottom-left, bottom-right) are greater than, less than, or equal to this element I'm at.

I thought about using a if statement in a function like this:

if arr[i][j] == arr[i][j+1]:
    print("Found a pair! %d is equal to %d, it's in location (%d, %d)" % (arr[i][j], arr[i][j+1], i, j+1))
elif arr[i][j] > arr[i][j+1]:
    print("%d is greater than %d, it's in location (%d, %d)" % (arr[i][j], arr[i][j+1], i, j+1))
else:
    print("%d is less than %d, it's in location (%d, %d)" % (arr[i][j], arr[i][j+1], i, j+1))

However, (1) I have to do this for all eight surrounding element positions and (2) I'm not sure how to write the function such that it moves from position to position correctly. Somehow one must use recursion for this to work, I think. One could possibly use a while loop as well.
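
For reference, visiting all eight neighbours does not require recursion. A minimal sketch with plain nested loops and a bounds check is shown here; the names OFFSETS and compare_neighbours are made up for illustration and are not part of NumPy.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [1, 6, 7, 8], [1, 1, 1, 12], [13, 3, 15, 16]])

# The eight neighbour offsets as (row step, column step).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           (0, -1),           (0, 1),
           (1, -1),  (1, 0),  (1, 1)]

def compare_neighbours(a):
    """Hypothetical helper: list of (position, neighbour, relation) triples."""
    rows, cols = a.shape
    out = []
    for i in range(rows):
        for j in range(cols):
            for di, dj in OFFSETS:
                ni, nj = i + di, j + dj
                # Skip neighbours that would fall outside the array;
                # this is what handles the edges and corners.
                if not (0 <= ni < rows and 0 <= nj < cols):
                    continue
                if a[i, j] == a[ni, nj]:
                    relation = "equal"
                elif a[i, j] > a[ni, nj]:
                    relation = "greater"
                else:
                    relation = "less"
                out.append(((i, j), (ni, nj), relation))
    return out

comparisons = compare_neighbours(arr)
equal_pairs = [(p, q) for p, q, rel in comparisons if rel == "equal"]
```

Each pair shows up twice in equal_pairs, once from each side. This is easy to read but slow for large arrays compared with slicing.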

I'm planning on saving all the "pairs" which are equal and creating a dictionary from them.

EDIT1:

I'm still having trouble understanding how the dimensions line up.

Our original matrix is shaped (4,4).

When we compare each row with the row below it (vertically adjacent pairs), we get an array shaped (3,4):

arr[:-1] == arr[1:]

# output
array([[ True, False, False, False],
       [ True, False, False, False],
       [False, False, False, False]], dtype=bool)

When we compare each column with the column to its right (horizontally adjacent pairs), we get an array shaped (4,3):

arr[:, :-1] == arr[:, 1:]
# output
array([[False, False, False],
       [False, False, False],
       [ True,  True, False],
       [False, False, False]], dtype=bool)

When I combine these two to see whether there are pairs both vertically and horizontally, how do I know I am not mixing up positions?
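
One way to keep the positions straight is to translate each mask index back into coordinates of the original array. The sketch below assumes the convention that an entry (i, j) of the row-wise mask refers to the pair (i, j) and (i+1, j), and an entry (i, j) of the column-wise mask refers to the pair (i, j) and (i, j+1); the variable names are only illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [1, 6, 7, 8], [1, 1, 1, 12], [13, 3, 15, 16]])

vmask = arr[:-1] == arr[1:]        # shape (3, 4): row i compared with row i+1
hmask = arr[:, :-1] == arr[:, 1:]  # shape (4, 3): column j compared with column j+1

# Convert mask indices to pairs of positions in the original array.
vertical_pairs = [((int(i), int(j)), (int(i) + 1, int(j)))
                  for i, j in np.argwhere(vmask)]
horizontal_pairs = [((int(i), int(j)), (int(i), int(j) + 1))
                    for i, j in np.argwhere(hmask)]

print(vertical_pairs)
print(horizontal_pairs)
```

Because both lists are expressed in coordinates of arr, they can be combined without worrying about the differing mask shapes.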

Upvotes: 1

Views: 3617

Answers (2)

Sammelsurium

Reputation: 516

Although it isn't entirely clear to me what you want to do, comparing adjacent array slices might be a convenient method. For example, arr[:-1] == arr[1:] tells you where there are pairs in adjacent rows. Then arr[:-1][arr[:-1] == arr[1:]] gives you an array of those values, and np.argwhere gives you the indexes.

>>> import numpy as np
>>> arr
array([[3, 1, 0, 2, 3, 3],
       [2, 1, 2, 2, 3, 3],
       [2, 3, 0, 1, 1, 0],
       [2, 1, 3, 3, 1, 2]])

>>> hpairs = (arr[:, :-1] == arr[:, 1:])
>>> hpairs
array([[False, False, False, False,  True],
       [False, False,  True, False,  True],
       [False, False, False,  True, False],
       [False, False,  True, False, False]], dtype=bool)

>>> arr[:, :-1][hpairs]
array([3, 2, 3, 1, 3])

>>> np.argwhere(hpairs)
array([[0, 4],
       [1, 2],
       [1, 4],
       [2, 3],
       [3, 2]], dtype=int64)

Change the == operator and directions of slices as needed.

It makes sense that the comparison yields a smaller array: the number of possible horizontal pairs per row is one less than the array width. Indexing either slice used in the comparison arr[:, :-1] == arr[:, 1:] with the boolean array gives the left or the right numbers of the pairs, respectively. The same holds analogously for the other directions.
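
As a small sketch of that last point (same sample data; the variable names are illustrative), the left and right members of each horizontal pair and their positions in arr can be recovered like this:

```python
import numpy as np

arr = np.array([[3, 1, 0, 2, 3, 3],
                [2, 1, 2, 2, 3, 3],
                [2, 3, 0, 1, 1, 0],
                [2, 1, 3, 3, 1, 2]])

hpairs = arr[:, :-1] == arr[:, 1:]

# Index the same slices that were compared, not arr itself,
# so the mask shape matches the array being indexed.
left_values = arr[:, :-1][hpairs]
right_values = arr[:, 1:][hpairs]

# Mask indices are the positions of the left members in arr;
# the right members sit one column further to the right.
left_idx = np.argwhere(hpairs)
right_idx = left_idx + [0, 1]
```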

What if there are pairs in multiple directions? That depends on what you want to do with them, I suppose. Let's say you want to find clusters of at least three equal numbers in the shape of an L turned 180 degrees. In other words, any position that is the upper of a vertical, and the right of a horizontal pair. (Same sample data as before.)

>>> vpairs = (arr[:-1] == arr[1:])
>>> hpairs[:-1] & vpairs[:, 1:]
array([[False, False, False, False,  True],
       [False, False, False, False, False],
       [False, False, False,  True, False]], dtype=bool)

If you want to count the number of equal neighbors at each position, here is one way to do it.

>>> backslashpairs = (arr[:-1, :-1] == arr[1:, 1:])
>>> slashpairs = (arr[1:, :-1] == arr[:-1, 1:])
>>> 
>>> equal_neighbors = np.zeros_like(arr, dtype=int)
>>> equal_neighbors[:-1] += vpairs
>>> equal_neighbors[1:] += vpairs
>>> equal_neighbors[:, :-1] += hpairs
>>> equal_neighbors[:, 1:] += hpairs
>>> equal_neighbors[1:, :-1] += slashpairs
>>> equal_neighbors[:-1, 1:] += slashpairs
>>> equal_neighbors[:-1, :-1] += backslashpairs
>>> equal_neighbors[1:, 1:] += backslashpairs
>>> equal_neighbors
array([[0, 1, 0, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [2, 1, 0, 2, 2, 0],
       [1, 0, 2, 1, 2, 0]])
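
The eight updates above follow one pattern, so they can also be written as a loop over the offsets. This is only a sketch of the same idea; count_equal_neighbours is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def count_equal_neighbours(a):
    """Hypothetical helper: number of equal neighbours (8 directions) per cell."""
    rows, cols = a.shape
    counts = np.zeros(a.shape, dtype=int)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # a cell is not its own neighbour
            # Cells that have a neighbour at offset (di, dj) ...
            src = (slice(max(0, -di), rows - max(0, di)),
                   slice(max(0, -dj), cols - max(0, dj)))
            # ... and those neighbours, aligned element by element.
            dst = (slice(max(0, di), rows - max(0, -di)),
                   slice(max(0, dj), cols - max(0, -dj)))
            counts[src] += a[src] == a[dst]
    return counts

arr = np.array([[3, 1, 0, 2, 3, 3],
                [2, 1, 2, 2, 3, 3],
                [2, 3, 0, 1, 1, 0],
                [2, 1, 3, 3, 1, 2]])
equal_neighbors = count_equal_neighbours(arr)
```

Swapping == for > or < in the last line of the loop would count greater or smaller neighbours instead.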

Upvotes: 2

user707650

Reputation:

There may be some nice numpy or scipy function out there that does this, but not one I know of.

Below is one way to solve this.

To add some confusion to it, I have indexed the rows with x, and the columns with y. That simply means that the element at (x, y) = (1, 2) is 7.

The trick with edges and corners is simply to expand the matrix with a border that is ignored later.
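
The border idea on its own can be sketched with np.pad. The sentinel value -1 is an assumption made for this illustration: it must be a value that never occurs in the data, otherwise border cells would show up as equal matches.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [1, 1, 7, 8], [1, 2, 3, 12], [13, 3, 15, 16]])

# Assumption: -1 never occurs in arr, so it is safe as a border marker.
SENTINEL = -1
padded = np.pad(arr, 1, mode="constant", constant_values=SENTINEL)

# Every original cell now has a full 3x3 window, even at the corners.
# arr[i, j] sits at padded[i + 1, j + 1], so its window is padded[i:i+3, j:j+3].
window = padded[0:3, 0:3]  # neighbourhood of arr[0, 0]
print(window)
```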

import numpy as np
arr = np.array([[1, 2, 3, 4], [1, 1, 7, 8], [1, 2, 3, 12], [13, 3, 15, 16]])
# Pad with a one-cell border of zeros so every element has eight neighbours.
# Note: if 0 occurs in arr, border cells would register as equal matches.
arr2 = np.zeros((arr.shape[0]+2, arr.shape[1]+2), dtype=arr.dtype)
arr2[1:-1,1:-1] = arr
# results[x, y, z] is 0 if neighbour z is equal, 1 if greater, -1 if smaller.
results = np.zeros(arr2.shape + (9,), dtype=int)
print(arr)

# Row (x) and column (y) offsets of the eight neighbours,
# in the order of the flattened 3x3 window with the centre removed.
transform = {'y': [-1, 0, 1, -1, 1, -1, 0, 1],
             'x': [-1, -1, -1, 0, 0, 1, 1, 1]}
for x in range(1, arr2.shape[0]-1):
    for y in range(1, arr2.shape[1]-1):
        subarr = arr2[x-1:x+2,y-1:y+2].flatten()
        mid = len(subarr)//2
        value = subarr[mid]
        greater = (subarr > value).astype(int)
        smaller = (subarr < value).astype(int)
        results[x,y,:] += greater
        results[x,y,:] -= smaller

# Drop the border and the self-comparison (index 4 of the 3x3 window).
results = np.dstack((results[1:-1,1:-1,:4], results[1:-1,1:-1,5:]))
xpos, ypos, zpos = np.where(results == 0)
matches = []
for x, y, z in zip(xpos, ypos, zpos):
    matches.append(((x, y), x+transform['x'][z], y+transform['y'][z]))
print(matches)

which results in

[[ 1  2  3  4]
 [ 1  1  7  8]
 [ 1  2  3 12]
 [13  3 15 16]]
[((0, 0), 1, 0), ((0, 0), 1, 1), ((1, 0), 0, 0), ((1, 0), 1, 1), ((1, 0), 2, 0), ((1, 1), 0, 0), ((1, 1), 1, 0), ((1, 1), 2, 0), ((2, 0), 1, 0), ((2, 0), 1, 1), ((2, 2), 3, 1), ((3, 1), 2, 2)]

In the above code, I store whether each neighbour is equal to, greater than, or smaller than the centre value as 0, 1 or -1 along a z-dimension. A simple transformation then translates the index along the z-dimension into an offset from the point under consideration.

The dstack step is not really necessary, but it gets rid of both the added border and the self-matches (there's no simple way to "slice out" an element in the middle of an array).

Pairs for greater-than or smaller-than matches can be found by simply changing the where condition, since those matches are stored as 1 or -1 in the results array.

I am not using a dict to store the result, since a single point can have multiple matches, and a dict keyed by an (x, y) coordinate tuple could only store one match per point. Hence the matches are stored in a list, with each element a tuple of the form

((x, y), xmatch, ymatch)

Since each pair is matched both ways, all matching pairs are contained twice in matches.
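
If the duplicates are unwanted, or a dict is still preferred, one possible post-processing step is sketched below. It is not part of the code above: it assumes entries in the ((x, y), xmatch, ymatch) format of the printed output, and maps each point to a list of matches so that one key can hold several of them.

```python
from collections import defaultdict

# A few entries in the ((x, y), xmatch, ymatch) format printed above.
matches = [((0, 0), 1, 0), ((0, 0), 1, 1), ((1, 0), 0, 0), ((1, 1), 0, 0)]

# Keep each unordered pair once by sorting its two end points.
unique_pairs = sorted({tuple(sorted([p, (xm, ym)])) for p, xm, ym in matches})

# Map each point to the list of points it matches.
by_point = defaultdict(list)
for p, xm, ym in matches:
    by_point[p].append((xm, ym))

print(unique_pairs)
print(dict(by_point))
```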

Upvotes: 1
