Identify all unique combinations along the third dimension of stacked 2D numpy arrays

Question

For 2 or more 2D integer numpy arrays stacked along axis=0, I am interested in:

1. identifying all unique numerical combinations along the third dimension,
2. labelling each combination with a new numerical value (a 'label'),
3. generating a new 2D array whose values are the labels, each signifying the combination of values in the source arrays at that position.

Sample data:

import numpy as np
arr1 = np.array(np.random.randint(low=0, high=4, size=25)).reshape(5,5)
arr2 = np.array(np.random.randint(low=0, high=4, size=25)).reshape(5,5)

A list of tuples of the combinations of interest can be obtained:

xx, yy = np.meshgrid(arr1, arr2, sparse=True)
combis = np.stack([xx.reshape(arr1.size), yy.reshape(arr2.size)])
u_combis = np.unique(combis, axis=1)
u_combis_lst = list(map(tuple, u_combis.T))

Generate a dictionary to map each combination to a label:

labels = [x for x in range(0, len(u_combis_lst))]
label_dict = dict(zip(u_combis_lst, labels))

Now, bullet points 1 and 2 seem to be achieved. My questions are:

How can I apply label_dict to arr1 and arr2 combined?
How can my code suggestions be improved?
How can the code be made to work with > 2 arrays?

To be complete, my aim is to recreate the functionality of the 'Combine' function in ArcGIS Pro.

JohanB · Accepted Answer

Another approach could be to create a dictionary lookup table based on the unique tuple combinations of the array values.

import numpy as np

# start with flattened arrays
arr1 = np.random.randint(low=0, high=4, size=25)
arr2 = np.random.randint(low=0, high=4, size=25)

# create tuples and store the unique tuples
combis = list(zip(arr1, arr2)) 

u_combis = set(combis) # get unique combinations

# map each unique tuple to an integer label
u_combi_dict = {combi:n for n, combi in enumerate(u_combis)}

# look up the label of every tuple, in the original order
combi_arr = np.array([u_combi_dict[combi] for combi in combis])

# if needed, reshape back to original extent for spatial analysis
combi_arr_grid = combi_arr.reshape(5, 5)
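
One thing to keep in mind: a set has no guaranteed order, so the labels from the snippet above can differ between runs. If reproducible labels matter, sorting the unique tuples before enumerating them is one option. Here is a minimal sketch on a tiny, hand-picked input (the values are made up for illustration only):

```python
import numpy as np

# small fixed inputs so the result can be checked by hand (illustrative values)
arr1 = np.array([0, 1, 0, 1])
arr2 = np.array([2, 2, 3, 2])

# .tolist() converts to plain Python ints, which keeps the tuples easy to read
combis = list(zip(arr1.tolist(), arr2.tolist()))

# sorting the unique tuples makes the label assignment deterministic
u_combi_dict = {combi: n for n, combi in enumerate(sorted(set(combis)))}

combi_arr = np.array([u_combi_dict[combi] for combi in combis])
print(u_combi_dict)  # {(0, 2): 0, (0, 3): 1, (1, 2): 2}
print(combi_arr)     # [0 2 1 2]
```

The sort costs a little extra time, but the same input then always yields the same labels.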

A generic function that can use an arbitrary number of input arrays could work as follows:

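def combine(input_arrays):

    # flatten each input so that 2D arrays are zipped element by element
    # (zipping 2D arrays directly would pair up whole rows, which cannot be hashed)
    flat_arrays = [np.ravel(arr) for arr in input_arrays]

    combis = list(zip(*flat_arrays))
    u_combis = set(combis)

    u_combi_dict = {combi: n for n, combi in enumerate(u_combis)}
    combi_arr = np.array([u_combi_dict[combi] for combi in combis])

    # flat array of labels; reshape to the original extent if needed
    return combi_arr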
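
For larger rasters the Python-level loop over tuples may become slow. A possible alternative, not part of the approach above, is to let `np.unique` do the work with `axis=0` and `return_inverse=True`. The sketch below assumes all inputs are 2D integer arrays of the same shape; the function name `combine_unique` is made up for this example.

```python
import numpy as np

def combine_unique(arrays):
    """Label each unique per-cell combination of values (hypothetical helper).

    Assumes `arrays` is a sequence of 2D integer arrays with identical shape.
    Returns (labels, u_combis): a 2D label array and the table of unique
    combinations, where row i of u_combis is the combination with label i.
    """
    stacked = np.stack(arrays, axis=0)      # shape (k, rows, cols)
    k = stacked.shape[0]
    cells = stacked.reshape(k, -1).T        # shape (rows * cols, k), one row per cell

    # unique rows, plus for every cell the index of its row in u_combis
    u_combis, inverse = np.unique(cells, axis=0, return_inverse=True)

    # reshape explicitly: the shape of `inverse` has differed between NumPy versions
    labels = inverse.reshape(stacked.shape[1:])
    return labels, u_combis

arr1 = np.array([[0, 1], [0, 1]])
arr2 = np.array([[2, 2], [3, 2]])
labels, u_combis = combine_unique([arr1, arr2])
print(labels)    # [[0 2]
                 #  [1 2]]
print(u_combis)  # [[0 2]
                 #  [0 3]
                 #  [1 2]]
```

Because `np.unique` returns the rows sorted, the labels are reproducible, and `u_combis[label]` gives back the source values for a label, which is roughly the role of the attribute table you get from a combine operation.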
