Python vectorized running max of parallel line segments

Question

I have a large number of independent, parallel horizontal line segments in a numpy array. Each segment has a start and an end (x-coordinates), and a value (y-coordinate). The segments don't necessarily have the same length (length = end - start).

An example matrix specifying the segments, one segment per row, could look as follows:

Start End Value
0     10  4
5     19  3
6     25  2
7     16  1
12    21  5

In code:

A = np.array([[0, 10, 4],
              [5, 19, 3],
              [6, 25, 2],
              [7, 16, 1],
              [12, 21, 5]])

I want to figure out the running max over the line segments. That is, in the above example, for x in the range [0,25), I want the corresponding max y. Example output corresponding to the example would be

Start End Max
0     10  4
10    12  3
12    21  5
21    25  2

I can do this in a for loop, but this is slow, since I have tens of thousands of segments. I cannot seem to think of a way to vectorize this. Can anyone?

Example for loop code:

x = np.arange(np.min(A[:,0]), np.max(A[:,1]))
maxes = np.zeros((x.shape[0], 2))
maxes[:,0] = x
maxes[:,1] = -np.inf

for a in A:
    ix = (x >= a[0]) & (x < a[1]) & (maxes[:,1] < a[2])
    maxes[ix,1] = a[2]

This code outputs an array with one row for every x in the range, unlike the example output above, which has one row per run of constant max. Either format is fine, since they are equivalent.
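To show why the two formats are equivalent, here is a minimal sketch that collapses a per-x array into Start/End/Max rows. The helper name `collapse_runs` is made up for illustration, and it assumes x holds consecutive integers, as `np.arange` produces in the loop above.

```python
import numpy as np

def collapse_runs(x, y):
    """Collapse per-point values into rows [start, end, value], end exclusive.

    Assumes x is a run of consecutive integers (unit spacing).
    """
    x = np.asarray(x)
    y = np.asarray(y)
    # Index at which each run of equal values begins.
    starts = np.r_[0, np.flatnonzero(y[1:] != y[:-1]) + 1]
    # Each run ends where the next one begins; the last ends one past the final x.
    ends = np.r_[x[starts[1:]], x[-1] + 1]
    return np.c_[x[starts], ends, y[starts]]
```

With the loop's result, this would be called as `collapse_runs(maxes[:, 0], maxes[:, 1])`.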

Elliot · Accepted Answer

You can use a boolean array to record whether each point in the space lies inside each line segment. Multiplying that boolean array by the segment values gives an array in which every point on the line has a vector of segment values; wherever a segment doesn't include the point, that segment's value is zeroed out. From there, the array's max method can be applied along a single axis. Note that zeroing out relies on the segment values being positive, as they are in the example.
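The masking step can be seen in isolation on a toy input. This is only an illustration: the two segments below are hypothetical, not taken from the question, and it relies on broadcasting a `(n_segments, 1)` column against a 1-D coordinate array.

```python
import numpy as np

# Hypothetical toy input: two segments, [0, 4) with value 7 and [2, 6) with value 9.
starts = np.array([[0], [2]])   # shape (2, 1), one row per segment
ends = np.array([[4], [6]])
vals = np.array([[7], [9]])

x = np.arange(6)                      # shape (6,), broadcasts against (2, 1)
mask = (x >= starts) & (x < ends)     # shape (2, 6): True where a segment covers x
masked = vals * mask                  # segment value where covered, 0 elsewhere
per_x_max = masked.max(axis=0)        # max over segments for each x

print(per_x_max)
# [7 7 9 9 9 9]
```

Because broadcasting already pairs every segment with every coordinate, the comparison works without explicitly repeating the coordinate array.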

import numpy as np

A = np.array([[0, 10, 4],
              [5, 19, 3],
              [6, 25, 2],
              [7, 16, 1],
              [12, 21, 5]])

# split the columns into (n_segments, 1) arrays so they broadcast against the
# coordinate array
seg_left = A[:, 0, None]
seg_right = A[:, 1, None]
seg_val = A[:, 2, None]

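# shift the coordinates so the space starts at 0. Plain subtraction is used
# rather than -= because seg_left and seg_right are views into A, and an
# in-place update would modify A itself
left_edge = seg_left.min()
seg_left = seg_left - left_edge
seg_right = seg_right - left_edge
right_edge = seg_right.max()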


# generate an array of coordinates and repeat it for each defined segment. This
# can then be used to determine which segments are on for each point
space = np.tile(np.arange(right_edge+1), (seg_val.size, 1))
space_bool = np.logical_and(space >= seg_left,
                            space < seg_right)

# find the maximum of the on segments
seg_max = (seg_val * space_bool).max(axis=0)

# determine the continuous segments. np.diff flags the last index of each run,
# so the +1 moves each step to the first index of the next run
steps = np.r_[0, np.where(np.diff(seg_max))[0]+1]
seg_val = seg_max[steps[:-1]]

# reset the left edge to the original left edge
steps += left_edge

print(np.c_[steps[:-1], steps[1:], seg_val])

# [[ 0 10  4]
#  [10 12  3]
#  [12 21  5]
#  [21 25  2]]
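A possible variant, offered as a sketch rather than as part of the answer above: since the max can only change at a segment endpoint, it can be evaluated once per interval between sorted unique endpoints instead of once per integer x. Using `np.where` with `-np.inf` in place of the multiplication also keeps negative segment values intact. The function name `running_max_segments` is made up here.

```python
import numpy as np

def running_max_segments(A):
    """Return rows [start, end, max] over half-open intervals [start, end)."""
    A = np.asarray(A)
    starts, ends = A[:, 0], A[:, 1]
    vals = A[:, 2].astype(float)

    # The max can only change at an endpoint, so evaluate it once per
    # interval between consecutive sorted unique endpoints.
    pts = np.unique(np.concatenate([starts, ends]))
    lo, hi = pts[:-1], pts[1:]

    # covers[i, j] is True when segment i spans the whole of interval j.
    covers = (starts[:, None] <= lo) & (ends[:, None] >= hi)

    # Fill uncovered cells with -inf rather than 0 so negative values survive.
    best = np.where(covers, vals[:, None], -np.inf).max(axis=0)

    # Merge neighbouring intervals that share the same maximum.
    first = np.r_[True, best[1:] != best[:-1]]
    last = np.r_[first[1:], True]
    return np.c_[lo[first], hi[last], best[first]]

A = np.array([[0, 10, 4],
              [5, 19, 3],
              [6, 25, 2],
              [7, 16, 1],
              [12, 21, 5]])
print(running_max_segments(A))
```

The intermediate array has shape (n_segments, n_intervals), so its size no longer depends on the coordinate range, but it is still quadratic in the number of segments in the worst case. Intervals that no segment covers come back with a max of -inf. The result is a float array because of the -inf fill.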
