ignacio.saravia
ignacio.saravia

Reputation: 327

Get Coordinates of duplicated numbers in matrix

I need ideas, this is not a homework .......

I have the following matrix, how i can obtain the coordinates of duplicates numbers,

imagen

duplicates [ [ [0,0],[1,0],[2,0],[0,1],[0,2],[1,2],[1,3],[2,2] ], [ [0,3], [0,4] , ........ ]

momentary code:

n = [ [1,1,1,3,3],[1,2,1,1,2],[1,2,1,3,3] ]

coordinates = [[-1,0],[0,1],[1,0],[0,-1]]
i = 0
while i < len(n):
    j = 0
    fila = []
    while j < len(n[0]):
        for coordinate in coordinates:
            X = i + coordinate[0]
            Y = j + coordinate[1]
            if X >= 0 and X < len(n) and Y >= 0 and Y < len(n[0]):
                # I tried with two for in here but its not efficient
        j+=1
    i+=1

Thanks

EDIT:

expected output:

Groups 1, separated 2, separated 3 and only 2

duplicates [ [ [0,0],[1,0],[2,0],[0,1],[0,2],[1,2],[1,3],[2,2] ], [ [0,3], [0,4] ], [ [1,1],[2,1] ] , [ [1,4] ], [ [2,3], [2,4] ]

Upvotes: 1

Views: 424

Answers (3)

PM 2Ring
PM 2Ring

Reputation: 55499

Here's a non-Numpy solution. It uses itertools.groupby to separate the coordinates into lists that have the that share the same value in n, then it separate those into sublists consisting of neighbours.

from itertools import product, groupby

n = [
    [1, 1, 1, 3, 3],
    [1, 2, 1, 1, 2],
    [1, 2, 1, 3, 3],
]

# Return a value from `n` given a coordinate tuple
def keyfunc(t):
    y, x = t
    return n[y][x]

# Make a list of coordinate tuples. 
# This code assumes that all rows in `n` have the same length
a = list(product(range(len(n)), range(len(n[0]))))

# Sort coordinates by the corresponding value in `n`
a.sort(key=keyfunc)

#Split the sorted coordinates into lists that share the same value...
groups = []
for k, g in groupby(a, key=keyfunc):
    # ... and separate those into sublists consisting of neighbours
    # We must turn g into a proper list
    sublists = []
    for t in list(g):
        y, x = t
        neighbours = [(y-1, x), (y+1, x), (y, x-1), (y, x+1)]
        # Search for a sublist that contains a neighbour of `t`
        for lst in sublists:
            if any(u in lst for u in neighbours):
                lst.append(t)
                break
        else:
            # No sublist of neighbours found, start a new one
            sublists.append([t])
    groups.append((k, sublists))

for row in groups:
    print(row)

output

(1, [[(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (1, 3), (2, 0), (2, 2)]])
(2, [[(1, 1), (2, 1)], [(1, 4)]])
(3, [[(0, 3), (0, 4)], [(2, 3), (2, 4)]])

Upvotes: 1

Oliver W.
Oliver W.

Reputation: 13459

Not extremely efficient, as the test array is used in a logical comparison check multiple times, but most loops are pushed to the C-level. At the same time, it demonstrates a few interesting methods you could explore if you have similar functionality to perform.

import scipy.ndimage as nd    
import numpy as np

n = np.array([ [1,1,1,3,3],[1,2,1,1,2],[1,2,1,3,3] ], dtype=np.int)

def print_coord(val, pos, shape):
    print("{}: {}".format(val[0], list(zip(*np.unravel_index(pos, dims=shape)))))
    return 0

for val in np.unique(n):
    labelled_array, num_features = nd.label(n == val)
    nd.labeled_comprehension(n, labelled_array, [1,2,3], 
        lambda v, p: print_coord(v, p, shape=n.shape), float, 0, True)

Output:

1: [(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (1, 3), (2, 0), (2, 2)]
2: [(1, 1), (2, 1)]
2: [(1, 4)]
3: [(0, 3), (0, 4)]
3: [(2, 3), (2, 4)]

You could of course append the results to a list, when you are not interested in keeping the label that matched that list of coordinates.

Upvotes: 3

Rustem K
Rustem K

Reputation: 1242

According to an output you provided to your example, I came up with the following solution:

import sys

def findMax(n, m, mat):
    max = -sys.maxint - 1
    for i in xrange(n):
        for j in xrange(m):
            if mat[i][j] > max:
                max = mat[i][j]
    return max


def doIt(mat):
    n, m = len(mat), len(mat[0])
    max_val = findMax(n, m, mat)
    memo = [None] * (max_val + 1)
    for i in xrange(n):
        for j in xrange(m):
            if memo[mat[i][j]] is None:
                memo[mat[i][j]] = [(i, j)]
            else:
                memo[mat[i][j]].append((i, j))
    return {i: item for i, item in enumerate(memo) if item is not None and len(item) > 1}

if __name__ == '__main__':
    n = [ [1,1,1,3,3],[1,2,1,1,2],[1,2,1,3,3] ]
    print doIt(n)

memo is a list where index indicate a potential duplicate number and corresponding value is the list of coordinates. Originally memo is initialized with None of length max found by traversing input matrix (O(N^2)). Key observation - If there are more than one pair of coordinates for specific number in memo, then we found a duplicate.

Upvotes: 1

Related Questions