acdr
acdr

Reputation: 4706

Performantly argsort two pre-sorted numpy arrays

I've got several groups of arrays. Within each group, all arrays are one-dimensional, all with the same length. Within each group is one primary array, which is already sorted.

For example:

grp_1 = [
    np.array([10, 20, 30, 40]),
    np.array(["A", "C", "E", "G"]),
    ]

grp_2 = [
    np.array([15, 25, 35]),
    np.array(["Z", "Y", "X"]),
    ]

Now I want to merge corresponding elements in my groups. I want this to happen in such a way so as to make the primary array of the result to be sorted (in a stable manner). For example:

def combine_groups(groups):
    combined_arrays = [np.concatenate([grp[idx] for grp in groups]) for idx in range(len(groups[0]))]
    sort_indices = np.argsort(combined_arrays[0], kind="mergesort")
    # Merge sort rather than quicksort because the former is stable
    return [arr[sort_indices] for arr in combined_arrays]

This works and is fine and dandy, but (for arrays much larger than this example) it's much slower than it needs to be. Merge sort is O(N log(N)), whereas merging arrays that are already sorted should be a O(N) affair.

I came across the cytoolz packages, which has a merge_sorted package which blows numpy out of the water when it comes to sorting my primary arrays. Unfortunately, I need to get the resulting indices, so that I can transform the non-primary arrays as well.

So: is the above possible in a faster way than using numpy's argsort?

Upvotes: 4

Views: 416

Answers (1)

tel
tel

Reputation: 13999

tl;dr

Just use merge sort like you're doing. Previous discussions and benchmarks of similar questions all suggest that you're not going to beat the method you're already using without writing some cython code yourself (and maybe not even then).

Methods without merge sort

Just zip your groups and then use cytoolz.merge_sorted:

from cytoolz import merge_sorted

# it will be an iterator that yields (10, 'A'), (15, 'Z'), (20, 'C'), (25, 'Y'), (30, 'E'), (35, 'X'), (40, 'G')
it = merge_sorted(zip(*grp_1), zip(*grp_2))

# unzip the tuples back into your desired arrays
grp_comb = [np.array(d) for d in zip(*it)]
print(grp_comb)

Output:

[array([10, 15, 20, 25, 30, 35, 40]), array(['A', 'Z', 'C', 'Y', 'E', 'X', 'G'], dtype='<U1')]

Alternatively, if you really want to combine your groups via an indirect sort like numpy.argsort, you can use ndarray.searchsorted:

ix = grp_1[0].searchsorted(grp_2[0])
grp_comb= [np.insert(grp_1[i], ix, grp_2[i]) for i in range(2)]
print(grp_comb)

Output:

[array([10, 15, 20, 25, 30, 35, 40]), array(['A', 'Z', 'C', 'Y', 'E', 'X', 'G'], dtype='<U1')]

Testing/timing

I used the following code to test if my answers produces output identical to the combine_groups function you posted, and to time the various methods:

from cytoolz import merge_sorted
import numpy as np
from numpy.testing import assert_array_equal

grp_1 = [
    np.array([10, 20, 30, 40]),
    np.array(["A", "C", "E", "G"]),
    ]

grp_2 = [
    np.array([15, 25, 35]),
    np.array(["Z", "Y", "X"]),
    ]

def combine_groups(*groups):
    combined_arrays = [np.concatenate([grp[idx] for grp in groups]) for idx in range(len(groups[0]))]
    sort_indices = np.argsort(combined_arrays[0], kind="mergesort")
    # Merge sort rather than quicksort because the former is stable
    return [arr[sort_indices] for arr in combined_arrays]

def combine_groups_ms(*groups):
    it = merge_sorted(*(zip(*g) for g in groups))
    return [np.array(d) for d in zip(*it)]

def combine_groups_ssi(g0, g1):
    ix = g0[0].searchsorted(g1[0])
    return [np.insert(g0[i], ix, g1[i]) for i in range(len(g0))]

expected = combine_groups(grp_1, grp_2)
assert_array_equal(combine_groups_ms(grp_1, grp_2), expected)
assert_array_equal(combine_groups_ssi(grp_1, grp_2), expected)

Here's the timings:

%%timeit
combine_groups(grp_1, grp_2)
6.84 µs ± 154 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%%timeit
combine_groups_ms(grp_1, grp_2)
10.4 µs ± 249 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%%timeit
combine_groups_ssi(grp_1, grp_2)
36.3 µs ± 1.27 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

So your initial attempt using concatenate followed by merge sort actually is faster than the code I wrote that takes direct advantage of the pre-sortedness. Similar questions have been asked before on SO, and have produced similar benchmarks. Looking at the details of the merge sort algorithm, I think this might be down to the fact that merging two sorted lists is a single step away from merge sort's best case performance scenario.

Upvotes: 2

Related Questions