Reputation: 50
This seems simple, but I would like to do this as efficiently as possible.
Effectively, I am interested in the points that are highest, lowest, most right, and most left.
Given an Array like [[10,2],[0,2],[1,10],[1,0],[2,3],[5,2],[7,2],[7,3],[3,8],[6,1]]
I have done this
max_x = max([p[0] for p in pts])
min_x = min([p[0] for p in pts])
max_y = max([p[1] for p in pts])
min_y = min([p[1] for p in pts])
But I don't just need the max_x
value though. I need the entire point, and I would rather not iterate though the list more than needed (for the sake of speed with large input).
Bonus points if it's general for N-dimensional points (highest and lowest in each dimension).
Upvotes: 1
Views: 2736
Reputation: 11171
Numpy was designed for this kind of problem. It is a performant implementation of multi-dimensional numerical arrays (nested lists of regular shape):
import numpy as np
pts = [[10,2],[0,2],[1,10],[1,0],[2,3],[5,2],[7,2],[7,3],[3,8],[6,1]]
arr = np.array(pts)
max_idx = np.argmax(arr, axis=0)
min_idx = np.argmin(arr, axis=0)
max_x, max_y = arr[max_idx]
min_x, min_y = arr[min_idx]
output:
# max_x, max_y, min_x, min_y
array([10, 2])
array([ 1, 10])
array([0, 2])
array([1, 0])
Performance comparison of lists vs arrays for large N
from random import random
N = int(1e7) # 10m points
def list_version(N):
pts = [[random(), random()] for j in range(N)]
max_x = max(pts, key = lambda x: x[0])
max_y = max(pts, key = lambda x: x[1])
min_x = min(pts, key = lambda x: x[0])
min_y = min(pts, key = lambda x: x[1])
return max_x, min_x, max_y, min_y
def arr_version(N):
arr = np.random.random(size=(N,2))
max_idx = np.argmax(arr, axis=0)
min_idx = np.argmin(arr, axis=0)
max_x, max_y = arr[max_idx]
min_x, min_y = arr[min_idx]
return max_x, min_x, max_y, min_y
%timeit list_version(N)
4.62 s ± 25.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit arr_version(N)
269 ms ± 2.28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Upvotes: 2
Reputation: 2128
max
and min
take a key
argument where you can specify which value to be used for comparison.
Use:
max_x = max(pts, key = lambda x: x[0])
min_x = min(pts, key = lambda x: x[0])
max_y = max(pts, key = lambda x: x[1])
min_y = min(pts, key = lambda x: x[1])
Output:
>>> print(max_x, min_x, max_y, min_y)
[10, 2] [0, 2] [1, 10] [1, 0]
Upvotes: 1
Reputation: 6090
l = [[10,2],[0,2],[1,10],[1,0],[2,3],[5,2],[7,2],[7,3],[3,8],[6,1]]
print (list(zip(*[(min(c), max(c)) for c in zip(*l)])))
Output:
[(0, 0), (10, 10)] # [(minx,miny),(maxx,maxy)]
Or just:
print ([(min(c), max(c)) for c in zip(*l)])
Output:
[(3, 10), (2, 9)] # [(minx,maxx),(miny,maxy)]
Upvotes: 0
Reputation: 780818
Use the key
argument so you can return the whole point while comparing just one item.
from operator import itemgetter
max_x = max(p, key=itemgetter(0))
min_x = min(p, key=itemgetter(0))
max_y = max(p, key=itemgetter(1))
min_y = min(p, key=itemgetter(1))
You could also do this with a single loop, instead of calling min()
and max()
.
min_x = max_x = min_y = max_y = p[0]
for point in p[1:]:
if point[0] < min_x[0]:
min_x = point
if point[0] > max_x[0]:
max_x = point
if point[1] < min_y[1]:
min_y = point
if point[1] > max_y[1]:
max_y = point
Upvotes: 2