Reputation: 947
I have a couple of numpy arrays:
orig = [[28021.22333333, 6585.53333333, 0. ],
[28021.22333333, 6585.53333333, 0. ],
[26723.52333333, 6587.48666667, 0. ],
[26723.52333333, 6587.48666667, 0. ],
[26063.11, 13089.56, 0. ],
[26063.11, 13089.56, 0. ],
[27424.91, 13091.4, 0. ],
[27424.91, 13091.4, 0. ],
[28833.60333333, 12641.65333333, 0. ],
[28833.60333333, 12641.65333333, 0. ],
[26125.33, 7954.18166667, 0. ],
[26125.33, 7954.18166667, 0. ],
[26121.29666667, 7956.72633333, 0. ],
[26121.29666667, 7956.72633333, 0. ],
[26116.26, 7957.80833333, 0. ],
[26116.26, 7957.80833333, 0. ],
[26110.98333333, 7957.263, 0. ],
[26110.98333333, 7957.263, 0. ],
[26106.27, 7955.17333333, 0. ],
[26106.27, 7955.17333333, 0. ],
[26102.84, 7951.85733333, 0. ],
[26102.84, 7951.85733333, 0. ]]
and
idxs = [ 0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 20, 21]
tri = [731, 703, 703, 731, 731, 731, 731, 693, 673, 699, 689, 731, 727, 731, 731, 731, 731, 731, 730]
pnts = [[28035.61081192, 6657.82528209, 2800. ],
[27951.42292993, 6561.84728091, 2800. ],
[28076.63625815, 6536.92743701, 2800. ],
[28139.0775588, 6773.36600593, 2800. ],
[27990.76839321, 6805.17674429, 2800. ],
[27856.70943257, 6734.2138896, 2800. ],
[27799.62835447, 6593.68175023, 2800. ],
[27846.23402973, 6449.33687603, 2800. ],
[27974.71914494, 6368.71983786, 2800. ],
[28124.96408673, 6389.55224384, 2800. ],
[28226.66757706, 6502.08637967, 2800. ],
[28232.24142249, 6653.66627254, 2800. ],
[28382.4101748, 6673.10904354, 2800. ],
[28315.56368133, 6812.44564901, 2800. ],
[28197.8230677, 6912.54705367, 2800. ],
[28049.54675563, 6956.10481526, 2800. ],
[27896.37306654, 6935.58740108, 2800. ],
[27764.78712281, 6854.54245845, 2800. ],
[27677.54132953, 6726.98339422, 2800. ]]
How can I group the values in idxs, tri and pnts based on the values of idxs (which are indices to rows of orig), so that each group corresponds to the same row value in orig? For example, I would like to get:
idxs = [[0,1], [2,3], [4,5], [7], [8,9], [10,11], [12,13], [14,15], [17], [18], [20,21]]
tri = [[731, 703], [703, 731], [731, 731], [731], [693, 673], [699, 689], [731, 727], [731, 731], [731], [731], [731, 730]]
and
pnts = [[[28035.61081192, 6657.82528209, 2800. ],
[27951.42292993, 6561.84728091, 2800. ]],
[[28076.63625815, 6536.92743701, 2800. ],
[28139.0775588, 6773.36600593, 2800. ]],
[[27990.76839321, 6805.17674429, 2800. ],
[27856.70943257, 6734.2138896, 2800. ]],
[[27799.62835447, 6593.68175023, 2800. ]],
[[27846.23402973, 6449.33687603, 2800. ],
[27974.71914494, 6368.71983786, 2800. ]],
[[28124.96408673, 6389.55224384, 2800. ],
[28226.66757706, 6502.08637967, 2800. ]],
[[28232.24142249, 6653.66627254, 2800. ],
[28382.4101748, 6673.10904354, 2800. ]],
[[28315.56368133, 6812.44564901, 2800. ],
[28197.8230677, 6912.54705367, 2800. ]],
[[28049.54675563, 6956.10481526, 2800. ]],
[[27896.37306654, 6935.58740108, 2800. ]],
[[27764.78712281, 6854.54245845, 2800. ],
[27677.54132953, 6726.98339422, 2800. ]]]
I tried numpy.split(), but I couldn't find the right condition to use. Also, in the end I will have to apply the same operation to corresponding arrays with several million entries.
Upvotes: 0
Views: 161
Reputation: 12417
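You can do this with the numpy_indexed package:
import numpy_indexed as npi

# orig, idxs, tri and pnts are assumed to be numpy arrays, as in the question
# Group by the rows of orig that idxs selects
eq = npi.group_by(orig[idxs])
# Split each array into the same groups
print(eq.split(idxs))
print(eq.split(tri))
print(eq.split(pnts))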
The groups do not come back in the order of idxs, but you can sort them afterwards if you like.
output:
#idxs
[array([0, 1]), array([20, 21]), array([8, 9]), array([14, 15]), array([2, 3]), array([17]), array([12, 13]), array([18]), array([4, 5]), array([7]), array([10, 11])]
#tri
[array([731, 703]), array([731, 730]), array([693, 673]), array([731, 731]), array([703, 731]), array([731]), array([731, 727]), array([731]), array([731, 731]), array([731]), array([699, 689])]
#pnts
[array([[28035.61081192, 6657.82528209, 2800. ],
[27951.42292993, 6561.84728091, 2800. ]]), array([[27764.78712281, 6854.54245845, 2800. ],
[27677.54132953, 6726.98339422, 2800. ]]), array([[27846.23402973, 6449.33687603, 2800. ],
[27974.71914494, 6368.71983786, 2800. ]]), array([[28315.56368133, 6812.44564901, 2800. ],
[28197.8230677 , 6912.54705367, 2800. ]]), array([[28076.63625815, 6536.92743701, 2800. ],
[28139.0775588 , 6773.36600593, 2800. ]]), array([[28049.54675563, 6956.10481526, 2800. ]]), array([[28232.24142249, 6653.66627254, 2800. ],
[28382.4101748 , 6673.10904354, 2800. ]]), array([[27896.37306654, 6935.58740108, 2800. ]]), array([[27990.76839321, 6805.17674429, 2800. ],
[27856.70943257, 6734.2138896 , 2800. ]]), array([[27799.62835447, 6593.68175023, 2800. ]]), array([[28124.96408673, 6389.55224384, 2800. ],
[28226.66757706, 6502.08637967, 2800. ]])]
If you want plain lists instead (the groups have different lengths, so numpy cannot store them as one regular rectangular array):
print(sorted([l.tolist() for l in eq.split(idxs)]))
output:
[[0, 1], [2, 3], [4, 5], [7], [8, 9], [10, 11], [12, 13], [14, 15], [17], [18], [20, 21]]
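If you would rather avoid the extra dependency, roughly the same grouping can be sketched with plain numpy, using np.unique to label equal rows and np.split to cut at the group boundaries. The helper name group_by_rows and the small arrays below are made up for illustration and are not the data from the question. Rows are compared for exact equality, so this assumes duplicate rows in orig are bit-identical floats.

```python
import numpy as np

def group_by_rows(keys, *values):
    """Split each array in `values` into groups sharing the same row of `keys`."""
    keys = np.asarray(keys)
    # inverse[i] is the label of the unique row that keys[i] equals
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # guard: some NumPy versions return a 2-D inverse with axis=
    # A stable sort brings equal labels together and keeps the original order inside a group
    order = np.argsort(inverse, kind="stable")
    # Positions where the sorted label changes are the split points
    boundaries = np.flatnonzero(np.diff(inverse[order])) + 1
    return [np.split(np.asarray(v)[order], boundaries) for v in values]

# Hypothetical miniature version of the arrays in the question
orig = np.array([[1.0, 2.0], [1.0, 2.0], [0.0, 5.0], [0.0, 5.0], [3.0, 1.0], [3.0, 1.0]])
idxs = np.array([0, 1, 2, 4, 5])
tri = np.array([10, 11, 12, 13, 14])

idx_groups, tri_groups = group_by_rows(orig[idxs], idxs, tri)
# Groups follow the sorted unique rows; reorder by first index to follow idxs instead
by_first_index = sorted(range(len(idx_groups)), key=lambda g: idx_groups[g][0])
idx_sorted = [idx_groups[g].tolist() for g in by_first_index]
tri_sorted = [tri_groups[g].tolist() for g in by_first_index]
print(idx_sorted)
print(tri_sorted)
```

The sort and np.unique both cost O(n log n), which should remain workable for a few million rows, though that has not been measured here.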
Upvotes: 1