Matt

Reputation: 392

Python vectorized running max of parallel line segments

I have a large number of independent, parallel horizontal line segments in a numpy array. Each segment has a start and an end (x-coordinates), and a value (y-coordinate). The segments don't necessarily have the same length (length = end - start).

An example matrix specifying the segments, one segment per row, could look as follows:

Start End Value
0     10  4
5     19  3
6     25  2
7     16  1
12    21  5

In code:

import numpy as np

A = np.array([[0,10,4],
              [5,19,3],
              [6,25,2],
              [7,16,1],
              [12,21,5]])

I want to figure out the running max over the line segments. That is, in the above example, for x in the range [0,25), I want the corresponding max y. Example output corresponding to the example would be

Start End Max
0     10  4
10    12  3
12    21  5
21    25  2

I can do this in a for loop, but this is slow, since I have tens of thousands of segments. I cannot seem to think of a way to vectorize this. Can anyone?

Example for loop code:

x = np.arange(np.min(A[:,0]), np.max(A[:,1]))
maxes = np.zeros((x.shape[0], 2))
maxes[:,0] = x
maxes[:,1] = -np.inf

for a in A:
    ix = (x >= a[0]) & (x < a[1]) & (maxes[:,1] < a[2])
    maxes[ix,1] = a[2]

This code outputs an array with a row for every x in the range, contrary to the output example above. Both are fine (and equivalent).
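For reference, the per-x form can be collapsed into the (start, end, max) form with a few array operations. The following is only a minimal sketch, assuming integer x-coordinates without gaps as in the example; the function name `runs_from_samples` is hypothetical and made up for illustration.

```python
import numpy as np

def runs_from_samples(x, y):
    """Collapse per-x values into (start, end, value) rows.

    Assumes x is a sorted run of consecutive integers and y holds one value per x.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    # indices where a new run starts: position 0 plus every place y changes
    starts = np.r_[0, np.flatnonzero(y[1:] != y[:-1]) + 1]
    # each run ends where the next one starts; the last ends one past the final x
    ends = np.r_[x[starts[1:]], x[-1] + 1]
    return np.c_[x[starts], ends, y[starts]]

# per-x maxes for the example in the question, written out by hand
x = np.arange(0, 25)
y = np.array([4] * 10 + [3] * 2 + [5] * 9 + [2] * 4)
print(runs_from_samples(x, y))
```

Comparing neighbours with `!=` rather than `np.diff` keeps the sketch usable when y contains `-np.inf` placeholders, since the difference of two infinities is NaN.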

Upvotes: 2

Views: 387

Answers (2)

Elliot

Reputation: 2690

You can use a boolean array to mark whether each point in the space lies inside each line segment. Multiplying that boolean array by the segment values gives an array in which every point has a vector of segment values, with the value zeroed out for any segment that doesn't include the point. From there, the array's max method can be applied along a single axis. Because excluded segments contribute 0, this relies on the segment values being positive.

import numpy as np

A = np.array([[0,10,4],
[5,19,3],
[6,25,2],
[7,16,1],
[12,21,5]])

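# segment starts, ends and values as (n, 1) columns so they broadcast
# against a row of x-coordinates
seg_left = A[:, 0, None]
seg_right = A[:, 1, None]
seg_val = A[:, 2, None]

# shift the x-axis so the leftmost start sits at 0; plain subtraction
# (rather than -=) avoids modifying A through the views above
left_edge = seg_left.min()
seg_left = seg_left - left_edge
seg_right = seg_right - left_edge
right_edge = seg_right.max()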


# generate an array of coordinates and repeat it for each defined segment. This 
# can then be used to determine what segments are on for each point
space = np.tile(np.arange(right_edge+1), (seg_val.size, 1))
space_bool = np.logical_and(space >= seg_left,
                            space < seg_right)

# find the maximum of the on segments
seg_max = (seg_val * space_bool).max(axis=0)

# determine the continuous segments. The +1 ensures that the correct value is
# selected
steps = np.r_[0, np.where(np.diff(seg_max))[0]+1]
seg_val = seg_max[steps[:-1]]

# reset the left edge to the original left edge
steps += left_edge

print(np.c_[steps[:-1], steps[1:], seg_val])

# [[ 0 10  4]
#  [10 12  3]
#  [12 21  5]
#  [21 25  2]]
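If the values can be zero or negative, masking with 0 no longer works, because the 0 of an excluded segment would win the max. One possible variation, shown here only as a sketch, masks with `-np.inf` via `np.where` instead of multiplying. The function name `running_max_dense` is hypothetical, and integer coordinates are assumed.

```python
import numpy as np

def running_max_dense(A):
    """Per-x maximum over half-open segments [start, end), as a dense array.

    Assumes integer coordinates. Returns (x, y); y is -inf where no segment covers x.
    """
    A = np.asarray(A)
    start, end, val = A[:, 0, None], A[:, 1, None], A[:, 2, None]
    x = np.arange(start.min(), end.max())
    # covered[i, j] is True when segment i contains x[j]
    covered = (x >= start) & (x < end)
    # excluded segments become -inf so they can never win the max
    y = np.where(covered, val, -np.inf).max(axis=0)
    return x, y

A = np.array([[0, 10, 4], [5, 19, 3], [6, 25, 2], [7, 16, 1], [12, 21, 5]])
x, y = running_max_dense(A)
print(y[[0, 10, 12, 21]])
```

Like the code above, this builds an intermediate array of shape (number of segments, number of x values), so memory use grows with both.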

Upvotes: 1

Thomas Kühn

Reputation: 9810

You can index arrays with arrays of booleans. This means that you can check all your coordinates against your conditions at once and then index the value column (A[:,2]) with the result. From your example results I take it that the end points of the line segments should not be included, hence the following code:

import numpy as np

A = np.array(
    [[0,10,4],
     [5,19,3],
     [6,25,2],
     [7,16,1],
     [12,21,5]]
)

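# intervals to evaluate, taken from the expected output in the question
ranges = np.array([
    [0,10], [10,12], [12,21], [21,25]
])

for xmin,xmax in ranges:
    # a segment overlaps [xmin, xmax) unless it ends at or before xmin
    # or starts at or after xmax
    print(xmin,xmax, np.max(A[~np.logical_or(A[:,1]<=xmin, A[:,0]>=xmax),2]))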

reproduces your desired result:

0 10 4
10 12 3
12 21 5
21 25 2
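The `ranges` array in this code is written out by hand from the expected output. One way it might be derived instead is to note that the maximum can only change at a segment start or end, so the sorted unique endpoints give candidate intervals, and neighbouring intervals with the same maximum can then be merged. The following is a sketch under that assumption, not a tuned implementation; the function name `running_max_segments` is hypothetical.

```python
import numpy as np

def running_max_segments(A):
    """Return (start, end, max) rows for half-open segments [start, end).

    Assumes every segment has start < end.
    """
    A = np.asarray(A)
    start, end, val = A[:, 0], A[:, 1], A[:, 2]
    # the maximum can only change at a segment start or end
    edges = np.unique(np.r_[start, end])
    lo, hi = edges[:-1], edges[1:]
    # active[i, j] is True when segment j covers the whole interval [lo[i], hi[i])
    active = (start <= lo[:, None]) & (end >= hi[:, None])
    best = np.where(active, val, -np.inf).max(axis=1)
    # drop intervals no segment covers, then merge neighbours with the same max
    keep = np.isfinite(best)
    lo, hi, best = lo[keep], hi[keep], best[keep]
    new_run = np.r_[True, (best[1:] != best[:-1]) | (lo[1:] != hi[:-1])]
    first = np.flatnonzero(new_run)
    last = np.r_[first[1:] - 1, len(best) - 1]
    return np.c_[lo[first], hi[last], best[first]]

A = np.array([[0, 10, 4], [5, 19, 3], [6, 25, 2], [7, 16, 1], [12, 21, 5]])
print(running_max_segments(A))
```

This does not depend on integer coordinates, but the intermediate `active` array has shape (number of intervals, number of segments), so for tens of thousands of segments it may need to be processed in chunks.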

Upvotes: 1
