aph

Reputation: 1855

Element-wise weave of two arrays used with numpy repeat

I have two arrays of unequal length, val1 and val2, that I am trying to weave together in a specific way defined by the equal-length arrays mult1 and mult2. In general, my arrays are very long (~1e6 elements) and this step is a performance-critical bottleneck in my calculation, so I cannot afford a Python for loop; instead I am trying to take advantage of vectorized functions in NumPy. To be explicit:

mult1 = np.array([0, 1, 2, 1, 0])
mult2 = np.array([1, 0, 1, 1, 0])

val1 = np.array([1, 2, 3, 4])
val2 = np.array([-1, -2, -3])

desired_final_result = np.array([-1, 1, 2, 3, -2, 4, -3])

The weaving of val1 and val2 is defined by the following element-wise procession through the indices of mult1 and mult2. Each entry of the two mult arrays defines how many elements to choose from the corresponding val array. We proceed element-wise through the mult arrays; the value of mult1[i] determines how many entries we choose from val1, then we proceed to the value of mult2[i] to select the appropriate number of val2 entries, always choosing the val1 entries to come first for each index i.

Note that len(val1) = mult1.sum() and len(val2) = mult2.sum(), so we always end up with a final array with len(desired_final_result) = len(val1) + len(val2).

Explicit explanation of minimal example
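
The procedure described above can be spelled out as a plain Python loop. This is only a sketch for checking results on small inputs: the function name weave_reference is made up for illustration, and it is exactly the slow loop that the question is trying to avoid.

```python
import numpy as np

def weave_reference(val1, val2, mult1, mult2):
    """Slow reference: for each i, take mult1[i] items from val1, then mult2[i] from val2."""
    out = []
    i1 = i2 = 0  # read positions in val1 and val2
    for m1, m2 in zip(mult1, mult2):
        out.extend(val1[i1:i1 + m1])  # val1 entries always come first
        i1 += m1
        out.extend(val2[i2:i2 + m2])
        i2 += m2
    return np.array(out)

mult1 = np.array([0, 1, 2, 1, 0])
mult2 = np.array([1, 0, 1, 1, 0])
val1 = np.array([1, 2, 3, 4])
val2 = np.array([-1, -2, -3])

# i=0: nothing from val1, then -1
# i=1: 1, then nothing from val2
# i=2: 2 and 3, then -2
# i=3: 4, then -3
# i=4: nothing from either array
result = weave_reference(val1, val2, mult1, mult2)
print(result)
```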

Question

Is there a way to use vectorized functions such as np.repeat and/or np.choose to solve my problem? Or do I need to resort to coding this calculation in C and wrapping it for Python?

Upvotes: 2

Views: 521

Answers (2)

user2379410


Creating a Boolean index into the result array:

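import numpy as np

# Interleave the counts: [mult1[0], mult2[0], mult1[1], mult2[1], ...]
mult = np.array([mult1, mult2]).ravel('F')
# True marks a val1 slot, False marks a val2 slot
tftf = np.tile([True, False], len(mult1))
mask = np.repeat(tftf, mult)

# Use a dtype that can hold both inputs instead of hard-coding int
result = np.empty(len(val1) + len(val2), dtype=np.result_type(val1, val2))
result[ mask] = val1
result[~mask] = val2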
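
For the minimal example in the question, the intermediate arrays look roughly like this. It is a sketch of the same Boolean-mask idea with the values traced in comments; taking the dtype from the inputs via np.result_type is an assumption about what non-integer inputs would need.

```python
import numpy as np

mult1 = np.array([0, 1, 2, 1, 0])
mult2 = np.array([1, 0, 1, 1, 0])
val1 = np.array([1, 2, 3, 4])
val2 = np.array([-1, -2, -3])

# Column-major ravel interleaves the two count arrays.
mult = np.array([mult1, mult2]).ravel('F')   # [0 1 1 0 2 1 1 1 0 0]

# One True/False pair per index i: True for val1, False for val2.
tftf = np.tile([True, False], len(mult1))

# Repeat each flag by its count to get one flag per output slot.
mask = np.repeat(tftf, mult)                 # [F T T T F T F]

result = np.empty(len(val1) + len(val2), dtype=np.result_type(val1, val2))
result[mask] = val1                          # fill the val1 slots in order
result[~mask] = val2                         # fill the val2 slots in order
print(result)
```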

Edit - I believe this works too:

idx = np.repeat(mult1.cumsum(), mult2)
result = np.insert(val1, idx, val2)

It's short, but it may not be faster.
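
To see which variant is faster on your own data, a rough timing sketch could look like the following. The function names weave_mask and weave_insert, the random test data, and the array size are all assumptions for illustration; no timings are quoted because they depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

def weave_mask(val1, val2, mult1, mult2):
    # Boolean-mask variant
    mult = np.array([mult1, mult2]).ravel('F')
    tftf = np.tile([True, False], len(mult1))
    mask = np.repeat(tftf, mult)
    result = np.empty(len(val1) + len(val2), dtype=np.result_type(val1, val2))
    result[mask] = val1
    result[~mask] = val2
    return result

def weave_insert(val1, val2, mult1, mult2):
    # np.insert variant: each val2 entry goes in after the val1 entries consumed so far
    idx = np.repeat(mult1.cumsum(), mult2)
    return np.insert(val1, idx, val2)

# Hypothetical test data: counts in {0, 1, 2}, values sized to match the counts.
rng = np.random.default_rng(0)
n = 100_000
mult1 = rng.integers(0, 3, size=n)
mult2 = rng.integers(0, 3, size=n)
val1 = rng.integers(0, 100, size=int(mult1.sum()))
val2 = -rng.integers(1, 100, size=int(mult2.sum()))

# Check that both variants agree before comparing speed.
same = np.array_equal(weave_mask(val1, val2, mult1, mult2),
                      weave_insert(val1, val2, mult1, mult2))

t_mask = timeit.timeit(lambda: weave_mask(val1, val2, mult1, mult2), number=10)
t_insert = timeit.timeit(lambda: weave_insert(val1, val2, mult1, mult2), number=10)
print(f"agree: {same}, mask: {t_mask:.4f}s, insert: {t_insert:.4f}s")
```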

Upvotes: 4

user2357112

Reputation: 282094

This can be done with NumPy routines, but the best I've come up with is pretty clumsy:

reps = numpy.empty([len(mult1)*2], dtype=int)
reps[::2] = mult1
reps[1::2] = mult2

to_repeat = numpy.empty_like(reps)
to_repeat[::2] = -1   # Avoid using 0 and 1 in case either of val1 or val2 is empty
to_repeat[1::2] = -2

indices = numpy.repeat(to_repeat, reps)
indices[indices==-1] = numpy.arange(len(val1))
indices[indices==-2] = numpy.arange(len(val1), len(val1) + len(val2))

final_result = numpy.concatenate([val1, val2])[indices]
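
Traced on the minimal example from the question, the steps above produce the following intermediate arrays. This is a sketch that reuses the same code with the question's inputs filled in; the values in the comments are for those inputs only.

```python
import numpy

mult1 = numpy.array([0, 1, 2, 1, 0])
mult2 = numpy.array([1, 0, 1, 1, 0])
val1 = numpy.array([1, 2, 3, 4])
val2 = numpy.array([-1, -2, -3])

# Interleaved repeat counts.
reps = numpy.empty(len(mult1) * 2, dtype=int)
reps[::2] = mult1
reps[1::2] = mult2                        # [0 1 1 0 2 1 1 1 0 0]

# Placeholder markers: -1 stands for "from val1", -2 for "from val2".
to_repeat = numpy.empty_like(reps)
to_repeat[::2] = -1
to_repeat[1::2] = -2

indices = numpy.repeat(to_repeat, reps)   # [-2 -1 -1 -1 -2 -1 -2]

# Replace the markers with positions in concatenate([val1, val2]).
indices[indices == -1] = numpy.arange(len(val1))
indices[indices == -2] = numpy.arange(len(val1), len(val1) + len(val2))
# indices is now [4 0 1 2 5 3 6]

final_result = numpy.concatenate([val1, val2])[indices]
print(final_result)
```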

Upvotes: 2
