Reputation: 35
I am writing code that uses Numba to JIT-compile my Python code. The function takes two arrays of the same length as input, randomly selects a slicing point, and returns a tuple of two Frankenstein arrays, each formed from parts of the two inputs. However, Numba does not yet support the numpy.concatenate function (I don't know if it ever will). As I am unwilling to drop NumPy, does anyone know a performant way to concatenate two NumPy arrays without the concatenate function?
import numpy as np

def randomSlice(str1, str2):
    lenstr = len(str1)
    # Split point in [1, lenstr - 1], so each output mixes both inputs
    rnd = np.random.randint(1, lenstr)
    return (np.concatenate((str1[:rnd], str2[rnd:])), np.concatenate((str2[:rnd], str1[rnd:])))
Upvotes: 2
Views: 3025
Reputation: 68682
This might work for you:
import numpy as np
import numba as nb
@nb.jit(nopython=True)
def randomSlice_nb(str1, str2):
    lenstr = len(str1)
    rnd = np.random.randint(1, lenstr)
    out1 = np.empty_like(str1)
    out2 = np.empty_like(str1)
    out1[:rnd] = str1[:rnd]
    out1[rnd:] = str2[rnd:]
    out2[:rnd] = str2[:rnd]
    out2[rnd:] = str1[rnd:]
    return (out1, out2)
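To see why preallocating and assigning slices can stand in for concatenation here, a minimal pure-NumPy sketch with a fixed split point may help. The helper name `splice_at` and the sample arrays are made up for illustration, and it assumes both inputs have the same length and dtype:

```python
import numpy as np

def splice_at(a, b, k):
    """Hypothetical helper: build a[:k] followed by b[k:] without np.concatenate."""
    out = np.empty_like(a)  # same shape and dtype as a (assumes b matches)
    out[:k] = a[:k]         # head comes from the first array
    out[k:] = b[k:]         # tail comes from the second array
    return out

a = np.arange(6)
b = np.arange(10, 16)
k = 2

spliced = splice_at(a, b, k)
reference = np.concatenate((a[:k], b[k:]))
print(np.array_equal(spliced, reference))
```

Because both arrays have the same length, the two slices always fill the output exactly, so no element of the `np.empty_like` buffer is left uninitialized.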
On my machine, using Numba 0.27 and timing via the timeit module to make sure I'm not counting the JIT compilation time in the stats (or you could run it once, and then time subsequent calls), the Numba version gives a small but non-negligible performance increase on input arrays of ints or floats of various sizes. If the arrays have a dtype of something like |S1, then Numba is significantly slower. The Numba team has spent very little time optimizing non-numeric use cases, so this isn't terribly surprising. I'm a little unclear about the exact form of your input arrays str1 and str2, so I can't guarantee that the code will work for your specific use case.
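The "run it once, then time subsequent calls" approach could look roughly like the sketch below. The function name `random_slice`, the array size, and the call count are arbitrary choices for illustration, and the fallback to plain Python when Numba is missing is only there so the snippet runs anywhere:

```python
import timeit
import numpy as np

try:
    import numba as nb
    jit = nb.njit            # nopython-mode JIT
except ImportError:          # assumption: fall back to plain Python without Numba
    def jit(func):
        return func

@jit
def random_slice(a, b):
    n = len(a)
    k = np.random.randint(1, n)
    out1 = np.empty_like(a)
    out2 = np.empty_like(a)
    out1[:k] = a[:k]
    out1[k:] = b[k:]
    out2[:k] = b[:k]
    out2[k:] = a[k:]
    return out1, out2

x = np.arange(1000, dtype=np.float64)
y = x + 1000.0

random_slice(x, y)  # warm-up call, so compilation is not included in the timing

calls = 1000
elapsed = timeit.timeit(lambda: random_slice(x, y), number=calls)
print(f"{elapsed / calls * 1e6:.2f} microseconds per call")
```

The numbers will vary with machine, Numba version, array size, and dtype, so treat them as a rough comparison only.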
Upvotes: 1