Matts

Reputation: 1341

reverse numpy reference cartesian product

I'm using a function to get the Cartesian product of 3 sets of arrays (cp1), which are derived from a column in a pandas DataFrame. After obtaining the product (cp2), I test whether each row's sum is less than 1.05, and where it is, I'd like to find the combination from the original arrays that generated that row. Is there a way to do this with numpy / Python / pandas? Any help would be appreciated. Ultimately I'd like to get the original indexes in each column of the DataFrame that produced the true condition.

In [780]: cp1.shape
Out[780]: (8, 3)

In [781]: cp2.shape
Out[781]: (512, 3)

In [782]: cp2
Out[782]: array([[ 0.43478262,  0.33333334,  0.29411763],
                 [ 0.43478262,  0.33333334,  0.32258067],
                 [ 0.43478262,  0.33333334,  0.32786885],
                   ...,
                 [ 0.43478262,  0.32258067,  0.32258067],
                 [ 0.43478262,  0.32258067,  0.29850748],
                 [ 0.43478262,  0.32258067,  0.32258067]])

In [783]: bools = cp2.sum(1) < 1.05

In [784]: np.where(bools)
Out[784]: (array([392, 398, 440, 446]),)

Upvotes: 0

Views: 249

Answers (1)

Dan D.

Reputation: 74655

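You could solve for ai and bi in ai * c + bi == idx, where ai < len(a), bi < len(b), and c == len(b), via ai, bi = divmod(idx, len(b)). This is the inverse of the flat-index calculation. With more than two arrays, apply divmod again to the quotient, once per remaining array.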
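As a minimal sketch of the divmod approach, extended to the three arrays in the question: it assumes the product was built in row-major order (the last array varies fastest, as itertools.product does), which the question does not confirm. The helper name unflatten and the lengths (8, 8, 8) are illustrative, taken from cp1.shape.

```python
from itertools import product


def unflatten(idx, lengths):
    """Invert a row-major flat index into one index per input array.

    Hypothetical helper: assumes the last array varies fastest.
    """
    out = []
    for n in reversed(lengths):
        # Peel off the fastest-varying position first.
        idx, remainder = divmod(idx, n)
        out.append(remainder)
    return tuple(reversed(out))


lengths = (8, 8, 8)            # three arrays of 8 values each (assumed from cp1.shape)
hits = [392, 398, 440, 446]    # row numbers reported by np.where(bools)

recovered = [unflatten(i, lengths) for i in hits]

# Cross-check against the ordering itertools.product would generate.
table = list(product(*(range(n) for n in lengths)))
assert all(table[i] == r for i, r in zip(hits, recovered))

for i, r in zip(hits, recovered):
    print(i, r)
```

Each recovered tuple holds the row position in the first, second, and third source array. If the product function orders its output differently, the positions come out permuted, so it is worth checking one row by hand.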

Another option that is more direct but uses far more space is to take the Cartesian product of numpy.arange(len(a)) and numpy.arange(len(b)) and then index it with your indexes to get the indexes in the original arrays.
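A rough sketch of this lookup-table approach, using np.indices as one way to build the Cartesian product of the arange index vectors. It again assumes row-major ordering and three arrays of length 8, neither of which is confirmed by the question. np.unravel_index gives the same answer without materializing the table.

```python
import numpy as np

lengths = (8, 8, 8)  # assumed from cp1.shape; one entry per source array

# Cartesian product of arange(8) x arange(8) x arange(8), one row per combination.
index_table = np.indices(lengths).reshape(len(lengths), -1).T  # shape (512, 3)

hits = np.array([392, 398, 440, 446])  # rows where cp2.sum(1) < 1.05

# Row k of index_table holds the positions in each source array that produced cp2[k].
origin = index_table[hits]

# Same result computed directly, without storing the full table.
same = np.column_stack(np.unravel_index(hits, lengths))

print(origin)
```

The rows of origin can then index back into the source data, for example cp1[origin[:, j], j] for column j, provided cp2 was built from the columns of cp1 in that order.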

Upvotes: 1
