Numpy vectorize for a lambda with multiple arguments

Question

I tried for a while last night to use the various different numpy vectorisation functions for a task (which to be clear is perfectly doable without them), to create linear interpolations between points.

Let's say I have a vector of floats (let's call them "points"),

v = np.array([9. , 1. , 4.2, 5.6, 3. , 4.6])

I want to interpolate between adjacent points, so I need to take these pairs:

def adjacent_pairs(v):
    """
    Given a 1D numpy array `v = np.array([1, ..., n])`, return a 2D numpy array of
    adjacent pairs, `np.array([(1,2), ..., (n-1,n)])`.
    """
    s = v.shape
    d = len(s)
    assert d == 1, ValueError(f"Vector must be 1D - got a {d}D vector: shape = {s})")
    return np.vstack([v[:-1],v[1:]]).T

adjacent_pairs(v) gives:

array([[9. , 1. ],
       [1. , 4.2],
       [4.2, 5.6],
       [5.6, 3. ],
       [3. , 4.6]])

I want to interpolate these pairs (the rows of the matrix, e.g. [9., 1.]) by intervals of size 0.2, but the interpolations may be ascending or descending, so I normalise the difference vector to find the 'direction' or sign (+1 if ascending, -1 for descending) and multiply this by the step size to pass to arange as the step argument.

This works:

def interpolate_1d(v, step=0.2):
    v_adj = adjacent_pairs(v)
    d = np.diff(v_adj) / np.abs(np.diff(v_adj))
    interpolated = [np.arange(*r, diff * step) for r, diff in zip(v_adj, d)]
    return interpolated

However I'm conscious that the zip() part isn't "in" numpy, and perhaps I should be doing it in a way that is.

I started looking at the various 'vectorised' functions in numpy (which as I understand it can sometimes speed up your code), but I'm having trouble reformatting this code into the abstractions of np.fromiter, np.vectorize, or np.frompyfunc and after a few hours last night I'm hoping someone more familiar with these can enlighten me as to how I could use one or more of these with my code.

I'd prefer to pass in the row and the difference sign separately (as lambda row, diff: ...), but I couldn't manage to get these to work, so I hstacked the v_adj and d arrays so that each row would hold both of them (and I would only need one argument to the lambda).

Here are the two versions of the function:

def interpolate_1d_vectorised(v, step=0.2):
    """
    Couldn't get this to work: how to expand out the two parts at a time to pass to
    the lambda function?
    """
    v_adj = adjacent_pairs(v)
    d = np.diff(v_adj) / np.abs(np.diff(v_adj))
    # lambda_func = lambda row, diff: np.arange(*row, diff * step)
    lambda_func = lambda row, diff: np.arange(row[0], row[1], diff * step)
    row_arange = np.vectorize(lambda_func, signature="(),()->()")
    interpolated = row_arange(v_adj, d)
    return interpolated


def interpolate_1d_vectorised_triples(v, step=0.2):
    v_adj = adjacent_pairs(v)
    d = np.diff(v_adj) / np.abs(np.diff(v_adj))
    triples = np.hstack([v_adj, d])
    triple_lambda = lambda t: np.arange(t[0], t[1], t[2] * step)
    row_arange_t = np.vectorize(triple_lambda, signature="()->()")
    interpolated = row_arange_t(triples)
    return interpolated

Some sample errors I got:

ValueError: setting an array element with a sequence.
- from row_arange(v_adj, d) where row_arange = np.vectorize(lambda_func, signature="(),()->()") (as in interpolate_1d_vectorised)
- also from np.fromiter([np.arange(a,b,c * step) for (a,b,c) in triples])

I tried debugging with a lambda function that just prints out the values it's working on, and it seems that the vectorising happens over every value in the array, rather than every row (which is what I'd like). This seems to explain the error message, but I'm still unclear on how to take three values at a time (or a row at a time) as input into the vectorised function and produce one output per that input.

I've used np.apply_along_axis and np.apply_over_axes before but I was getting various errors using these too.

I expected this to work:

triple_lambda = lambda t: np.arange(t[0], t[1], t[2] * 0.2)
np.apply_along_axis(triple_lambda, 1, triples)

but it gave: ValueError: could not broadcast input array from shape (16) into shape (40), which I assume means that the interpolated values make the vector larger.

np.apply_over_axes(triple_lambda, triples, axes=[0,2]) gave TypeError: () takes 1 positional argument but 2 were given (same when axes=[0,1]).

(This was about the point I gave up)

Sorry if this isn't the right application to use these functions for, please let me know if there's something else better meant for this (and what if anything these functions would be used for instead). I was going to just delete these attempts and move on but thought I should ask here so I might learn how to use these functions in future. Any advice much appreciated!

Numpy vectorize for a lambda with multiple arguments

Answers (1)

Related Questions