Reputation: 63
I am trying to label x and y points based on their being in a specific section of a meshgrid in python. The points are stored in a pandas dataframe.
Here I have a scatter plot of the coordinates, with the grid plotted on top of them. The entire grid is much bigger: it runs from the bottom-left point (500, 1250) to the upper-right point (2750, 3250), which means the whole grid is 225x200 sections.
I want to iterate through the sections of the grid and check whether a point is inside. If a point is inside a section, I want to label the point with that section's name. I want to add a column called 'section' to the dataframe that stores the section each point belongs to.
In the example (picture above) I would like to label all the points with 770 <= x <= 780 and 1795 <= y <= 1805 with the section name 'A3'.
My code currently looks like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
df = pd.read_csv('./file.csv', sep=';')
x_min = df['X[mm]'].min()
x_max = df['X[mm]'].max()
y_min = df['Y[mm]'].min()
y_max = df['Y[mm]'].max()
#side of the square in mm:
square_side = 10
xs = np.arange(x_min, x_max+square_side, square_side)
ys = np.arange(y_min, y_max+square_side, square_side)
x_2, y_2 = np.meshgrid(xs, ys, indexing = 'ij')
fig, ax = plt.subplots(figsize=(9,9))
ax.plot(df['X[mm]'], df['Y[mm]'], linewidth=0.2, c='black')
#plot meshgrid as grid instead of points:
segs1 = np.stack((x_2[:,[0,-1]],y_2[:,[0,-1]]), axis=2)
segs2 = np.stack((x_2[[0,-1],:].T,y_2[[0,-1],:].T), axis=2)
plt.gca().add_collection(LineCollection(np.concatenate((segs1, segs2))))
ax.set_aspect('equal', 'box')
plt.show()
I also have a function that determines whether a point is inside a rectangle (this does not use meshgrid):
def is_inside_rect(M, A, B, D):
    '''Check if a point M is inside the rectangle with consecutive corners A, B, D
    (A-B and B-D are two adjacent sides)'''
    # 0 <= dot(AB,AM) <= dot(AB,AB) and 0 <= dot(BD,BM) <= dot(BD,BD)
    #print(np.dot(B - A, D - A))
    return 0 <= np.dot(B - A, M - A) <= np.dot(B - A, B - A) and 0 <= np.dot(D - B, M - B) <= np.dot(D - B, D - B)
I thought of using it in a while loop like this:
x = x_min
y = y_min
while (x <= x_max + square_side) and (y <= y_max + square_side):
    A = np.array([x, y])
    B = np.array([x + square_side, y])
    D = np.array([x + square_side, y + square_side])
    print(A, B, D)
    df['c'] = df[['X[mm]', 'Y[mm]']].apply(lambda coord: 'red' if is_inside_rect(np.array(coord), A, B, D) else 'black', axis=1)
    x += square_side
    y += square_side
but this is very slow, and it changes the colors of all the points in every iteration.
Upvotes: 3
Views: 658
Reputation: 1191
Since all your squares are equally sized, there is no need to define all of them beforehand and then determine which squares contain which points. I would use the coordinates of each point to directly determine which square it lands in.
Let's take the 1-dimensional case, for the sake of simplicity. You want to group points on the number line into "squares" (really 1-d line segments). If your first square starts at x=0, your second at x=10, your third at x=20, and so on, how do you find the square for an arbitrary point x? You know that your squares are spaced by 10 (and you know they start at 0, which makes things easier), so you can simply divide by 10 and round down to get the square index.
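To make the 1-D idea concrete, here is a minimal sketch. The sample points and the name `segment_length` are made up for illustration; they are not from the question's data.

```python
import numpy as np

# Hypothetical 1-D points; the segments are [0, 10), [10, 20), [20, 30), ...
segment_length = 10
points = np.array([0.0, 3.5, 9.99, 10.0, 27.2])

# Floor division maps each point to the index of the segment that contains it
segment_index = (points // segment_length).astype(int)
print(segment_index)  # [0 0 0 1 2]
```

Note that a point sitting exactly on a boundary (10.0 here) falls into the segment that starts there, not the one that ends there.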
You can just as easily do the same thing in 2 dimensions (or n dimensions), by applying it to each coordinate separately.
square_side = 10
x_min = df['X[mm]'].min()
y_min = df['Y[mm]'].min()
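def label_point(x, y):
    # Double forward slash is floor (round down) division; int() is needed
    # because the result is a float when the coordinates are floats
    # Add 1 here if you really want 1-based indexing
    x_label = int((x - x_min) // square_side)
    # Letters only cover 26 rows (A-Z); past that, chr() returns other characters
    y_label = chr(ord('A') + int((y - y_min) // square_side))
    return f'{y_label}{x_label}'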
df['label'] = df[['X[mm]', 'Y[mm]']].apply(lambda coord: label_point(*coord), axis=1)
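If the dataframe is large, the same floor-division idea can probably be applied to whole columns at once instead of row by row with `apply`. The following is only a sketch under some assumptions: the sample rows stand in for `file.csv`, and the `R<row>C<col>` label format is my own choice, used because single letters run out after 26 rows.

```python
import pandas as pd

# Hypothetical sample data standing in for file.csv
df = pd.DataFrame({
    'X[mm]': [500.0, 512.5, 779.0],
    'Y[mm]': [1250.0, 1263.0, 1251.0],
})

square_side = 10
x_min = df['X[mm]'].min()
y_min = df['Y[mm]'].min()

# Column-wise floor division: one integer cell index per point and axis
col = ((df['X[mm]'] - x_min) // square_side).astype(int)
row = ((df['Y[mm]'] - y_min) // square_side).astype(int)

# Assumed label format "R<row>C<col>"; any other naming scheme works too
df['section'] = 'R' + row.astype(str) + 'C' + col.astype(str)
print(df)
```

The integer `row` and `col` columns can also be kept as they are, which makes it easy to group or filter by section later.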
As for the efficiency, this solution looks at each point only once and does a constant amount of work per point, so it is O(n) in the number of points. Your solution looks at each square once and, for each square, looks at every point, so it is O(n × m), where n is the number of points and m is the number of squares.
Your solution is more general, in that your is_inside_rect function works when your grid of rectangles has an arbitrary rotation. In that case, I would recommend rotating all your points about the origin so that the grid becomes axis-aligned, and then running my solution.
Also, your loop adds 10 to both x and y on every iteration, so you are traversing your space diagonally and only visiting the squares on the diagonal. I don't think you meant to do that.
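For the rotated case, a rough sketch of what I mean could look like this. The 30 degree grid angle, the sample points and the helper `rotate_about_origin` are all hypothetical, and it assumes the grid's corner sits at the origin.

```python
import numpy as np

def rotate_about_origin(points, angle_rad):
    """Rotate an (n, 2) array of points counter-clockwise by angle_rad."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rotation = np.array([[c, -s], [s, c]])
    return points @ rotation.T

# Assumption: the grid is rotated 30 degrees counter-clockwise about the origin
grid_angle = np.deg2rad(30)
square_side = 10

# Hypothetical points, built from known positions in the grid's own frame
points_in_grid_frame = np.array([[5.0, 5.0], [15.0, 5.0], [25.0, 35.0]])
points = rotate_about_origin(points_in_grid_frame, grid_angle)

# Undo the grid rotation so the squares are axis-aligned, then floor-divide
aligned = rotate_about_origin(points, -grid_angle)
cells = np.floor(aligned / square_side).astype(int)
print(cells)  # column index, row index for each point
```

Points that lie almost exactly on a square's edge may land on either side after the rotation because of floating-point rounding.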
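If you did want to visit every square, you would need one loop per axis. A small sketch with made-up bounds (not your real data) showing the difference:

```python
import numpy as np

# Hypothetical small grid: 3 squares wide, 2 squares tall
x_min, x_max = 0, 30
y_min, y_max = 0, 20
square_side = 10

visited = []
for x in np.arange(x_min, x_max, square_side):
    for y in np.arange(y_min, y_max, square_side):
        # (x, y) is the lower-left corner of one square
        visited.append((int(x), int(y)))

print(len(visited))  # 6 squares; stepping x and y together reaches only (0, 0) and (10, 10)
```

This still does work proportional to the number of squares times the number of points if you test every point inside the loop, so the direct index computation is the better option for a big grid.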
Upvotes: 2