Reputation: 1341
I'm using a function to get the Cartesian product of 3 arrays (the columns of cp1), which are derived from a column in a pandas dataframe. After obtaining the product (cp2), I test whether each row's sum is less than 1.05. Where it is, I'd like to use the row position to find the combination of elements from the original arrays that generated that row. Is there a way to do this with numpy / python / pandas? Any help would be appreciated. Ultimately I'd like to get the original indexes in each column of the dataframe that produced the true condition.
In [780]: cp1.shape
Out[780]: (8, 3)
In [781]: cp2.shape
Out[781]: (512, 3)
In [782]: cp2
Out[782]: array([[ 0.43478262, 0.33333334, 0.29411763],
[ 0.43478262, 0.33333334, 0.32258067],
[ 0.43478262, 0.33333334, 0.32786885],
...,
[ 0.43478262, 0.32258067, 0.32258067],
[ 0.43478262, 0.32258067, 0.29850748],
[ 0.43478262, 0.32258067, 0.32258067]])
In [783]: bools = cp2.sum(1) < 1.05
In [784]: np.where(bools)
Out[784]: (array([392, 398, 440, 446]),)
Upvotes: 0
Views: 249
Reputation: 74655
You could solve for ai and bi in ai * c + bi == idx, where ai < len(a), bi < c, and c == len(b), via ai, bi = divmod(idx, len(b)). This is the inverse of the index calculation.
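For the three-array case in the question, the same idea can be sketched as below. This is only an illustration under assumptions: the data in cp1 is made up, and the product is built so that the last column varies fastest (row-major order), which is what the cp2 printout suggests but is not confirmed. If your product function orders rows differently, the unravelling has to match that order.

```python
import numpy as np

# Made-up stand-in for cp1: 8 values in each of 3 columns (not the asker's data).
cp1 = np.linspace(0.29, 0.45, 24).reshape(8, 3)
n = cp1.shape[0]

# Cartesian product of the 3 columns, last column varying fastest (assumed order).
grids = np.meshgrid(cp1[:, 0], cp1[:, 1], cp1[:, 2], indexing="ij")
cp2 = np.stack([g.ravel() for g in grids], axis=1)  # shape (512, 3)

# Flat row positions where the test is true.
flat_idx = np.flatnonzero(cp2.sum(axis=1) < 1.05)

# Invert the index calculation: repeated divmod, peeling off the fastest axis first.
rest, i2 = divmod(flat_idx, n)
i0, i1 = divmod(rest, n)

# np.unravel_index does the same inversion in one call.
j0, j1, j2 = np.unravel_index(flat_idx, (n, n, n))

# Check: the recovered positions reproduce the matching rows of cp2.
recovered = np.column_stack([cp1[i0, 0], cp1[i1, 1], cp1[i2, 2]])
```

Here i0, i1 and i2 are positions into the first, second and third source array. If those arrays came from dataframe rows, something like df.index[i0] would turn positions into labels, assuming the row order was preserved when cp1 was built.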
Another option that is more direct but uses far more space is to take the Cartesian product of numpy.arange(len(a)) and numpy.arange(len(b)) and then index it with your indexes to get the indexes in the original arrays.
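A rough sketch of that second option, extended to three arrays of length 8 as in the question. It assumes the index product is built in the same row order as cp2 (last column varying fastest here); the hits array just reuses the positions shown in the question's np.where output.

```python
import numpy as np

n = 8  # length of each source array, taken from cp1.shape in the question

# Cartesian product of the index ranges, built in the same (assumed) row order as cp2.
idx_grids = np.meshgrid(np.arange(n), np.arange(n), np.arange(n), indexing="ij")
idx_product = np.stack([g.ravel() for g in idx_grids], axis=1)  # shape (512, 3)

# Row positions where the sum test was true (from the question's np.where output).
hits = np.array([392, 398, 440, 446])

# Each row holds the position in the first, second and third source array.
source_positions = idx_product[hits]
```

This stores a full (512, 3) integer array just to look up a few rows, which is the extra space cost mentioned above, but it avoids any index arithmetic.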
Upvotes: 1