Reputation: 419
I have a long list of H-points with known coordinates, and a list of TP-points. I'd like to know whether each H-point falls within a given radius (e.g. r=5) of any(!) TP-point.
import pandas as pd
dfPoints = pd.DataFrame({'H-points': ['a', 'b', 'c', 'd', 'e'],
                         'Xh': [10, 35, 52, 78, 9],
                         'Yh': [15, 5, 11, 20, 10]})
# one label per TP-point, so that all columns have the same length
dfTrafaPostaje = pd.DataFrame({'TP-points': ['a', 'b', 'c'],
                               'Xt': [15, 25, 35],
                               'Yt': [15, 25, 35],
                               'M': [5, 2, 3]})
def inside_circle(x, y, a, b, r):
    return (x - a)*(x - a) + (y - b)*(y - b) < r*r
I've made a start, but this only makes it easy to check a single TP-point. With e.g. 1,500 TP-points and 30,000 H-points, I need a more general solution. Can anyone help?
Upvotes: 0
Views: 622
Reputation: 150735
Another option is to use distance_matrix from scipy.spatial:
import numpy as np
from scipy.spatial import distance_matrix
dist_mat = distance_matrix(dfPoints[['Xh', 'Yh']], dfTrafaPostaje[['Xt', 'Yt']])
dfPoints[np.min(dist_mat, axis=1) < 5]
This took about 2 s for 1500 rows in dfPoints and 30000 rows in dfTrafaPostaje.
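If the full distance matrix becomes too large to hold in memory, a spatial index is one alternative. The following is only a sketch using scipy.spatial.cKDTree, which is a different technique from distance_matrix; the variable names radius, tree, nearest_dist, nearest_idx and inside are made up for illustration. It relies on the fact that an H-point lies inside some circle exactly when its nearest TP-point is within the radius.

```python
import numpy as np
import pandas as pd
from scipy.spatial import cKDTree

dfPoints = pd.DataFrame({'H-points': ['a', 'b', 'c', 'd', 'e'],
                         'Xh': [10, 35, 52, 78, 9],
                         'Yh': [15, 5, 11, 20, 10]})
dfTrafaPostaje = pd.DataFrame({'TP-points': ['a', 'b', 'c'],
                               'Xt': [15, 25, 35],
                               'Yt': [15, 25, 35]})
radius = 5  # assumed radius, taken from the question's example

# Build the tree once over the TP coordinates.
tree = cKDTree(dfTrafaPostaje[['Xt', 'Yt']].to_numpy())

# For every H-point, find the distance to (and row position of) its nearest TP-point.
nearest_dist, nearest_idx = tree.query(dfPoints[['Xh', 'Yh']].to_numpy(), k=1)

# Boundary points count as inside here; use < to match the strict test in the question.
inside = nearest_dist <= radius
print(dfPoints[inside])
```

Only one distance per H-point is kept, so memory grows with the number of H-points rather than with the number of pairs.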
Update: to get the index of the reference points with highest score:
dist_mat = distance_matrix(dfPoints[['Xh', 'Yh']], dfTrafaPostaje[['Xt', 'Yt']])
# get the M scores of those within range
M_mat = pd.DataFrame(np.where(dist_mat <= 5, dfTrafaPostaje['M'].values[None, :], np.nan),
                     index=dfPoints['H-points'],
                     columns=dfTrafaPostaje['TP-points'])
# get the label of the TP-point with the largest M value for each H-point,
# masking with np.nan the H-points that have no TP-point in range
dfPoints['TP'] = np.where(M_mat.notnull().any(axis=1), M_mat.idxmax(axis=1), np.nan)
For the included sample data:
  H-points  Xh  Yh   TP
0        a  10  15    a
1        b  35   5  NaN
2        c  52  11  NaN
3        d  78  20  NaN
4        e   9  10  NaN
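As far as I know, recent pandas versions emit a deprecation warning when idxmax is applied to rows that are entirely NaN, which is the case here for every H-point with no TP-point in range. A possible workaround, sketched below under that assumption, is to do the lookup in NumPy with -inf as the filler so that argmax is always defined. The names radius, in_range, scores, best and labels are illustrative, and the TP column name follows the sample output above.

```python
import numpy as np
import pandas as pd
from scipy.spatial import distance_matrix

dfPoints = pd.DataFrame({'H-points': ['a', 'b', 'c', 'd', 'e'],
                         'Xh': [10, 35, 52, 78, 9],
                         'Yh': [15, 5, 11, 20, 10]})
dfTrafaPostaje = pd.DataFrame({'TP-points': ['a', 'b', 'c'],
                               'Xt': [15, 25, 35],
                               'Yt': [15, 25, 35],
                               'M': [5, 2, 3]})
radius = 5  # assumed radius, as in the question

dist_mat = distance_matrix(dfPoints[['Xh', 'Yh']].to_numpy(),
                           dfTrafaPostaje[['Xt', 'Yt']].to_numpy())
in_range = dist_mat <= radius

# M score where the TP-point is in range, -inf elsewhere, so argmax never sees NaN.
scores = np.where(in_range, dfTrafaPostaje['M'].to_numpy()[None, :], -np.inf)
best = scores.argmax(axis=1)

# Translate column positions to TP labels, then blank out H-points with nothing in range.
labels = pd.Series(dfTrafaPostaje['TP-points'].to_numpy()[best], index=dfPoints.index)
dfPoints['TP'] = labels.where(in_range.any(axis=1))
print(dfPoints)
```

Rows without any TP-point in range end up as NaN, matching the table above.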
Upvotes: 2
Reputation: 61910
You could use cdist from scipy to compute the pairwise squared distances, then create a mask that is True where an H-point is within the radius of at least one TP-point, and finally filter:
import pandas as pd
from scipy.spatial.distance import cdist
dfPoints = pd.DataFrame({'H-points': ['a', 'b', 'c', 'd', 'e'],
'Xh': [10, 35, 52, 78, 9],
'Yh': [15, 5, 11, 20, 10]})
dfTrafaPostaje = pd.DataFrame({'TP-points': ['a', 'b', 'c'],
'Xt': [15, 25, 35],
'Yt': [15, 25, 35]})
radius = 5
distances = cdist(dfPoints[['Xh', 'Yh']].values, dfTrafaPostaje[['Xt', 'Yt']].values, 'sqeuclidean')
mask = (distances <= radius*radius).sum(axis=1) > 0 # create mask
print(dfPoints[mask])
Output
H-points Xh Yh
0 a 10 15
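For the sizes mentioned in the question (30,000 by 1,500), the full float64 matrix holds 45 million entries, roughly 360 MB. If that is too much, the same cdist check can be run on blocks of H-points. This is a minimal sketch; the helper any_within_radius and its chunk_size parameter are hypothetical names, not part of SciPy.

```python
import numpy as np
from scipy.spatial.distance import cdist


def any_within_radius(h_xy, tp_xy, radius, chunk_size=5000):
    """Return a boolean array, True where an H-point is within `radius` of any TP-point."""
    h_xy = np.asarray(h_xy, dtype=float)
    tp_xy = np.asarray(tp_xy, dtype=float)
    mask = np.zeros(len(h_xy), dtype=bool)
    for start in range(0, len(h_xy), chunk_size):
        stop = start + chunk_size
        # Only a (chunk_size x number of TP-points) block is in memory at a time.
        block = cdist(h_xy[start:stop], tp_xy, 'sqeuclidean')
        mask[start:stop] = (block <= radius * radius).any(axis=1)
    return mask


# Sample coordinates from the question; a tiny chunk_size just to exercise the loop.
h_xy = np.array([[10, 15], [35, 5], [52, 11], [78, 20], [9, 10]])
tp_xy = np.array([[15, 15], [25, 25], [35, 35]])
mask = any_within_radius(h_xy, tp_xy, radius=5, chunk_size=2)
print(mask)
```

With the DataFrames above, this could be called as any_within_radius(dfPoints[['Xh', 'Yh']].values, dfTrafaPostaje[['Xt', 'Yt']].values, radius) and the result used to index dfPoints.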
Upvotes: 1