How to concatenate two numpy ndarrays without using concatenate

Question

I am writing code that uses Numba to JIT-compile my Python code. The function takes two arrays of the same length as input, randomly selects a slicing point, and returns a tuple of two Frankenstein arrays formed from parts of the two input strings. However, Numba does not yet support the numpy.concatenate function (I don't know if it ever will). As I am unwilling to drop NumPy, does anyone know a performant way to concatenate two NumPy arrays without the concatenate function?

import numpy as np

def randomSlice(str1, str2):
    lenstr = len(str1)
    # Split point in [1, lenstr - 1], so both pieces are non-empty
    rnd = np.random.randint(1, lenstr)
    return (np.concatenate((str1[:rnd], str2[rnd:])), np.concatenate((str2[:rnd], str1[rnd:])))

JoshAdel · Accepted Answer

This might work for you:

import numpy as np
import numba as nb

@nb.jit(nopython=True)
def randomSlice_nb(str1, str2):
    lenstr = len(str1)
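    # Split point in [1, lenstr - 1], so both pieces are non-empty
    rnd = np.random.randint(1, lenstr)

    # Preallocate both outputs with the shape and dtype of str1
    out1 = np.empty_like(str1)
    out2 = np.empty_like(str1)

    # Head of str1 followed by tail of str2
    out1[:rnd] = str1[:rnd]
    out1[rnd:] = str2[rnd:]

    # Head of str2 followed by tail of str1
    out2[:rnd] = str2[:rnd]
    out2[rnd:] = str1[rnd:]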
    return (out1, out2)
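
As a sanity check, the preallocate-and-assign pattern can be compared against np.concatenate in plain NumPy. This is only a sketch: the function name random_slice_prealloc and the explicit cut argument are hypothetical, introduced here so that the split point is fixed and the result is reproducible. It assumes both inputs have the same length and dtype; because the outputs are created with np.empty_like(a), values taken from b would be cast to the dtype of a.

```python
import numpy as np

def random_slice_prealloc(a, b, cut):
    """Swap the tails of a and b at index cut without np.concatenate.

    Hypothetical helper: cut is passed in explicitly (instead of being
    drawn at random) so the result can be checked deterministically.
    """
    out1 = np.empty_like(a)  # same shape and dtype as a
    out2 = np.empty_like(a)
    out1[:cut] = a[:cut]     # head of a
    out1[cut:] = b[cut:]     # tail of b
    out2[:cut] = b[:cut]     # head of b
    out2[cut:] = a[cut:]     # tail of a
    return out1, out2

a = np.arange(10)
b = np.arange(10, 20)
cut = 4

out1, out2 = random_slice_prealloc(a, b, cut)

# Reference results built with np.concatenate
ref1 = np.concatenate((a[:cut], b[cut:]))
ref2 = np.concatenate((b[:cut], a[cut:]))

assert np.array_equal(out1, ref1)
assert np.array_equal(out2, ref2)
```

Both approaches copy every element exactly once; the difference is only that the output buffers are allocated up front and filled by slice assignment.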

On my machine, with Numba 0.27 and timing via the timeit module so that JIT compilation time is not counted in the stats (alternatively, run the function once and time only the subsequent calls), the Numba version gives a small but non-negligible speedup for integer and float input arrays of various sizes. If the arrays have a dtype such as |S1, the Numba version is significantly slower. The Numba team has spent very little time optimizing non-numeric use cases, so this isn't terribly surprising. I'm a little unclear about the exact form of your input arrays str1 and str2, so I can't guarantee that the code will work for your specific use case.
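
The "run it once, then time subsequent calls" approach could look roughly like the sketch below. The names random_slice_nb and random_slice_np, the array size, and the repetition count are all assumptions chosen for illustration, not the setup used for the measurements above. The try/except fallback is also an addition, so that the script still runs (uncompiled) when Numba is not installed. No timings are quoted because they depend on the machine and the Numba version.

```python
import timeit
import numpy as np

try:
    import numba as nb
    jit = nb.njit                      # nopython-mode JIT
except ImportError:                    # assumption: fall back to plain Python
    def jit(func):
        return func

@jit
def random_slice_nb(a, b):
    n = len(a)
    rnd = np.random.randint(1, n)      # split point in [1, n - 1]
    out1 = np.empty_like(a)
    out2 = np.empty_like(a)
    out1[:rnd] = a[:rnd]
    out1[rnd:] = b[rnd:]
    out2[:rnd] = b[:rnd]
    out2[rnd:] = a[rnd:]
    return out1, out2

def random_slice_np(a, b):
    # Baseline using np.concatenate, as in the question
    rnd = np.random.randint(1, len(a))
    return (np.concatenate((a[:rnd], b[rnd:])),
            np.concatenate((b[:rnd], a[rnd:])))

a = np.arange(1000, dtype=np.float64)
b = a + 1000.0

# Warm-up call: the first call triggers compilation, so it is kept
# out of the timed region.
random_slice_nb(a, b)

t_nb = timeit.timeit(lambda: random_slice_nb(a, b), number=1000)
t_np = timeit.timeit(lambda: random_slice_np(a, b), number=1000)
print("numba version:       %.6f s per 1000 calls" % t_nb)
print("concatenate version: %.6f s per 1000 calls" % t_np)
```

Repeating the measurement for several array sizes and dtypes is worthwhile, since the answer above reports that the outcome differs between numeric dtypes and |S1.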
