NumPy - Trouble vectorizing method

Question

I have three arrays, a, b and c. Shapes are (N, 2), (N, 3), (N, 3) respectively.

I need to compare elements within each row of one (N, 3) array (delta in the code below) and, based on the result, update one index in the same row of the other (N, 3) array (cube). I thought I understood how to vectorize this method, but I think my dimensions are incorrect.

What I currently have is:

def to_cube(points):
    cube = np.empty((len(points), 3), dtype=np.half)
    delta = np.empty_like(cube)

    q = ((2 / 3) * points[:, 0]) / 0.1
    r = (((-1 / 3) * points[:, 0]) + ((np.sqrt(3) / 3) * points[:, 1])) / 0.1

    cube[:, 0] = np.round(q)
    cube[:, 1] = np.round(-q-r)
    cube[:, 2] = np.round(r)
    delta[:, 0] = np.abs(cube[:, 0] - q)
    delta[:, 1] = np.abs(cube[:, 1] - (-q-r))
    delta[:, 2] = np.abs(cube[:, 2] - r)

    if delta[:, 0] > delta[:, 1] and delta[:, 1] > delta[:, 2]:
        cube[:, 0] = -cube[:, 1] - cube[:, 2]
    elif delta[:, 1] > delta[:, 2]:
        cube[:, 1] = -cube[:, 0] - cube[:, 2]
    else:
        cube[:, 2] = -cube[:, 0] - cube[:, 1]

    return cube

This throws a ValueError: The truth value of an array with more than one element is ambiguous.

After looking at the conditionals, it's clear that the first check, delta[:, 0] > delta[:, 1], returns a boolean array of shape (N,) rather than a single True or False, which is what the if statement needs. How do I change this so that, for each row, the comparison uses that row of delta and the update is applied to a specific index in the same row of cube?
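
To make the failure concrete, here is a minimal sketch with a made-up three-row delta (the values are placeholders, not output from to_cube). It shows that the comparison yields one boolean per row, and that using that array as an if condition raises the error:

```python
import numpy as np

# Hypothetical 3-row delta array; values chosen only for illustration.
delta = np.array([[5.0, 10.0, 3.0],
                  [0.4, 0.1, 0.2],
                  [0.1, 0.3, 0.2]], dtype=np.half)

# Element-wise comparison of two columns: one boolean per row.
mask = delta[:, 0] > delta[:, 1]
print(mask.shape)  # (3,)
print(mask)        # [False  True False]

try:
    if mask:       # Python needs a single truth value here
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The array itself is fine; the problem is only that if/elif/else expects one decision, while the mask carries one decision per row.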

Edit: sample

This sample assumes that q and r have already been computed. These arrays represent cube and delta:

>>> cube
array([[275.0, -400.0, 124.0]], dtype=float16) # so this is a (1, 3) but could be (N, 3)
>>> cube[0]
array([275.0, -400.0, 124.0], dtype=float16)
>>> delta
array([[5., 10., 3.]], dtype=float16)
>>> delta[0]
array([5., 10., 3.], dtype=float16)

Now step through the conditionals for that row (values substituted in):

if 5.0 > 10.0 and 10.0 > 3.0:
    cube[0, 0] = -(-400.0) - 124.0
elif 10.0 > 3.0:
    cube[0, 1] = -(275.0) - 124.0
else:
    cube[0, 2] = -(275.0) - (-400.0)

return cube # array([[275.0, -399.0, 124.0]], dtype=float16)

This shows what happens for one row; now I need to do it for all rows.

Edit: potential solution (is it vectorized?)

There is a way to ensure the rows are accessed independently using a for-range:

def to_cube(points):
    cube = np.empty((len(points), 3), dtype=np.half)
    delta = np.empty_like(cube)

    q = ((2 / 3) * points[:, 0]) / 0.1
    r = (((-1 / 3) * points[:, 0]) + ((np.sqrt(3) / 3) * points[:, 1])) / 0.1

    cube[:, 0] = np.round(q)
    cube[:, 1] = np.round(-q-r)
    cube[:, 2] = np.round(r)
    delta[:, 0] = np.abs(cube[:, 0] - q)
    delta[:, 1] = np.abs(cube[:, 1] - (-q-r))
    delta[:, 2] = np.abs(cube[:, 2] - r)

    for i in range(len(cube)):
        if delta[i, 0] > delta[i, 1] and delta[i, 1] > delta[i, 2]:
            cube[i, 0] = -cube[i, 1] - cube[i, 2]
        elif delta[i, 1] > delta[i, 2]:
            cube[i, 1] = -cube[i, 0] - cube[i, 2]
        else:
            cube[i, 2] = -cube[i, 0] - cube[i, 1]

    return cube

However, I am now "looping over" the arrays, which doesn't seem vectorized or broadcast.

pstatix · Accepted Answer

To anybody interested, this is how I solved the problem:

def to_cube(points):
    cube = np.empty((len(points), 3), dtype=np.half)
    delta = np.empty_like(cube)

    q = ((2 / 3) * points[:, 0]) / 0.1
    r = (((-1 / 3) * points[:, 0]) + ((np.sqrt(3) / 3) * points[:, 1])) / 0.1

    cube[:, 0] = np.round(q)
    cube[:, 1] = np.round(-q-r)
    cube[:, 2] = np.round(r)
    delta[:, 0] = np.abs(cube[:, 0] - q)
    delta[:, 1] = np.abs(cube[:, 1] - (-q-r))
    delta[:, 2] = np.abs(cube[:, 2] - r)
    
    # boolean masks, one per branch; each excludes rows taken by an earlier branch
    rxc = (delta[:, 0] > delta[:, 1]) & (delta[:, 1] > delta[:, 2])
    ryc = ~rxc & (delta[:, 1] > delta[:, 2])
    rzc = ~(rxc | ryc)

    # update just those indices by condition
    cube[rxc, 0] = -cube[rxc, 1] - cube[rxc, 2]
    cube[ryc, 1] = -cube[ryc, 0] - cube[ryc, 2]
    cube[rzc, 2] = -cube[rzc, 0] - cube[rzc, 1]

    return cube

If anybody sees room for improvement or optimization, I'd love to know!
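
One way to gain confidence in a mask-based rewrite is to compare it against the row-by-row loop on random data. The following is only a sketch under stated assumptions: the names fix_loop and fix_masked are hypothetical, the random cube and delta arrays are stand-ins rather than real to_cube output, and each mask explicitly excludes rows claimed by an earlier branch so that it mirrors the if/elif/else chain.

```python
import numpy as np

def fix_loop(cube, delta):
    """Row-by-row reference: the if/elif/else chain from the question."""
    out = cube.copy()
    for i in range(len(out)):
        if delta[i, 0] > delta[i, 1] and delta[i, 1] > delta[i, 2]:
            out[i, 0] = -out[i, 1] - out[i, 2]
        elif delta[i, 1] > delta[i, 2]:
            out[i, 1] = -out[i, 0] - out[i, 2]
        else:
            out[i, 2] = -out[i, 0] - out[i, 1]
    return out

def fix_masked(cube, delta):
    """Same logic with one boolean mask per branch (hypothetical helper)."""
    out = cube.copy()
    first = (delta[:, 0] > delta[:, 1]) & (delta[:, 1] > delta[:, 2])
    second = ~first & (delta[:, 1] > delta[:, 2])   # the "elif" rows
    third = ~(first | second)                       # the "else" rows
    out[first, 0] = -out[first, 1] - out[first, 2]
    out[second, 1] = -out[second, 0] - out[second, 2]
    out[third, 2] = -out[third, 0] - out[third, 1]
    return out

# Stand-in inputs: small integers are exactly representable in float16.
rng = np.random.default_rng(0)
cube = rng.integers(-300, 300, size=(1000, 3)).astype(np.half)
delta = rng.uniform(0, 0.5, size=(1000, 3)).astype(np.half)

same = np.array_equal(fix_loop(cube, delta), fix_masked(cube, delta))
print(same)
```

Both helpers work on copies, so the inputs can be reused for the comparison.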

A benchmark on my system:

import numpy as np
from timeit import timeit

# draw all coordinates in one call instead of a 37-million-iteration Python loop
points = np.random.uniform(0, 50, size=(37000000, 2)).astype(np.half)

p = 'from __main__ import points, to_cube; to_cube(points)'

timeit(p, number=1)

# output: 17.94858811999
