reverse numpy reference cartesian product

Question

I'm using a function to get the Cartesian product of 3 arrays (cp1), which are derived from a column in a pandas dataframe. After obtaining the product (cp2), I test whether each row's sum is less than 1.05, and where it is, I'd like to find the combination from the original arrays that generated that row, based on the position where the test is true. Is there a way to do this with numpy / python / pandas? Ultimately I'd like to get the original indexes in each column of the dataframe that generated the true condition. Any help would be appreciated.

In [780]: cp1.shape

Out[780]: (8, 3)

In [781]: cp2.shape

Out[781]: (512, 3)

In [782]: cp2

Out[782]: array([[ 0.43478262,  0.33333334,  0.29411763],
                 [ 0.43478262,  0.33333334,  0.32258067],
                 [ 0.43478262,  0.33333334,  0.32786885],
                   ..., 
                 [ 0.43478262,  0.32258067,  0.32258067],
                 [ 0.43478262,  0.32258067,  0.29850748],
                 [ 0.43478262,  0.32258067,  0.32258067]])

In [783]: bools = cp2.sum(1) < 1.05

In [784]: np.where(bools)

Out[784]: (array([392, 398, 440, 446]),)

Dan D. · Accepted Answer

You could solve for ai and bi in ai * c + bi == idx, where ai < len(a), bi < c, and c == len(b), via ai, bi = divmod(idx, len(b)). This is the inverse of the flat index calculation.
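As a rough sketch of the divmod idea, assuming the product is built in row-major order (the last array's index varies fastest, which matches the cp2 output shown in the question): the arrays a and b below are made-up stand-ins, not the asker's data. For the three-array case, np.unravel_index does the same inversion in one call, here assuming each of the three columns has 8 entries as cp1.shape suggests.

```python
import numpy as np

# Hypothetical inputs standing in for two of the original arrays.
a = np.array([0.1, 0.2, 0.3])
b = np.array([1.0, 2.0])

# Cartesian product in row-major order: index into a varies slowest,
# index into b varies fastest.
prod = np.array([[x, y] for x in a for y in b])

# Flat row positions of interest (e.g. what np.where(bools)[0] returns).
idx = np.array([1, 4])

# Invert idx == ai * len(b) + bi.
ai, bi = divmod(idx, len(b))

# Three arrays of length 8 each (assumed from cp1.shape == (8, 3)):
# unravel the flat positions from the question into one index per array.
flat = np.array([392, 398, 440, 446])
i, j, k = np.unravel_index(flat, (8, 8, 8))
```

If the product function orders rows differently (for example, first array varying fastest), the shape and the order of the returned indexes would need to be adjusted to match.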

Another option that is more direct but uses far more space is to take the Cartesian product of numpy.arange(len(a)) and numpy.arange(len(b)) and then index it with your indexes to get the indexes in the original arrays.
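A minimal sketch of this second approach, extended to three columns. The cp1 values here are invented for illustration, and the product is built with np.indices rather than the asker's own function, so the row order is an assumption (row-major, last column fastest).

```python
import numpy as np

# Hypothetical stand-in for cp1: 4 rows, 3 columns (one column per array).
cp1 = np.array([
    [0.50, 0.40, 0.30],
    [0.30, 0.30, 0.30],
    [0.45, 0.35, 0.25],
    [0.60, 0.50, 0.40],
])
n, ncols = cp1.shape

# Cartesian product of the row indexes of each column, shape (n**3, 3).
# Row r holds (i, j, k): the positions in columns 0, 1, 2 that form row r.
index_product = np.indices((n,) * ncols).reshape(ncols, -1).T

# Build the value product from the index product, so both share row order.
cp2 = cp1[index_product, np.arange(ncols)]

# Same test as in the question, then look up the originating indexes.
bools = cp2.sum(axis=1) < 1.05
origin = index_product[bools]
```

Each row of origin gives the position within each column of cp1 that produced a passing combination. If cp1 came from a dataframe, those positions could then be mapped to labels with something like df.index[origin[:, 0]]. The index product takes as much memory as cp2 itself, which is the extra space cost mentioned above.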
