Reputation: 17544
I have some 2D data which contains edges which were rasterized into pixels. I want to implement an efficient data structure which returns all edge pixels which lie in a non-axis-aligned 2D triangle.
The image shows a visualization of the problem where white denotes the rasterized edges, and red visualizes the query triangle. The result would be all white pixels which lie on the boundary or inside the red triangle.
Looking more closely at the image, one notices that the boolean data is sparse: if we denote black pixels with a 0 and white pixels with a 1, the number of 1s in the data is much lower than the number of 0s. Therefore, rasterizing the red triangle and checking for each point in its interior whether it is white or black is not the most efficient approach.
Besides the sparseness of the data, the white pixels originate from edges, so it is in their nature to be connected together. However, at junctions with other lines, they have more than two neighbors. Pixels which are at a junction should only be returned once.
The data must be processed in realtime, but with no GPU assistance. There will be multiple queries for different triangle contents, and after each one, points may be removed from the data structure. However, new points won't be inserted anymore after the initial filling of the data structure.
The query triangles are already known when the rasterized edges arrive.
There are more query triangles than data edges.
There are many spatial data structures available. However, I'm wondering which one is the best for my problem. I'm willing to implement a highly optimized data structure to solve this problem, as it will be a core element of the project. Therefore, mixes or variations of data structures are also welcome!
R-trees seem to be the best data structure I have found for this problem so far, as they support rectangle-based queries. I would query for all white pixels within the AABB of the query triangle, then check for each returned pixel whether it lies within the query triangle.
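To make the two-stage idea concrete, here is a minimal sketch in C. The `Pt` and `Tri` types and the function names are made up for illustration, and a plain linear scan over the white pixels stands in for the R-tree query; only the filter logic (cheap AABB rejection first, exact triangle test second) is the point here. Boundary pixels count as inside, and either winding order is assumed to be acceptable.

```c
#include <stddef.h>

typedef struct { int x, y; } Pt;        /* hypothetical pixel type */
typedef struct { Pt a, b, c; } Tri;     /* hypothetical triangle type */

static int imin3(int a, int b, int c) { int m = a < b ? a : b; return m < c ? m : c; }
static int imax3(int a, int b, int c) { int m = a > b ? a : b; return m > c ? m : c; }

/* Twice the signed area of (a, b, p): the sign tells on which side
   of the directed line a->b the point p lies. */
static long long side(Pt a, Pt b, Pt p)
{
    return (long long)(b.x - a.x) * (p.y - a.y)
         - (long long)(b.y - a.y) * (p.x - a.x);
}

/* Exact test; points on the boundary count as inside. */
int in_triangle(Tri t, Pt p)
{
    long long d0 = side(t.a, t.b, p);
    long long d1 = side(t.b, t.c, p);
    long long d2 = side(t.c, t.a, p);
    int has_neg = d0 < 0 || d1 < 0 || d2 < 0;
    int has_pos = d0 > 0 || d1 > 0 || d2 > 0;
    return !(has_neg && has_pos);   /* mixed signs mean outside */
}

/* Copies every pixel of pix[0..n) that lies in t into out (which must
   have room for n entries) and returns how many were copied.
   The loop is a stand-in for whatever the spatial index returns. */
size_t query_triangle(const Pt *pix, size_t n, Tri t, Pt *out)
{
    int xlo = imin3(t.a.x, t.b.x, t.c.x), xhi = imax3(t.a.x, t.b.x, t.c.x);
    int ylo = imin3(t.a.y, t.b.y, t.c.y), yhi = imax3(t.a.y, t.b.y, t.c.y);
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        Pt p = pix[i];
        if (p.x < xlo || p.x > xhi || p.y < ylo || p.y > yhi)
            continue;               /* rejected by the AABB */
        if (in_triangle(t, p))
            out[k++] = p;           /* survived the exact test */
    }
    return k;
}
```

Since each pixel is stored once in the input, a junction pixel is also reported only once, no matter how many edges pass through it.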
However, I'm not sure how well R-trees will behave since edge-based data will not be easily groupable into rectangles, as the points are clumped together on narrow lines and not spread out.
I'm also not sure whether it would make sense to pre-build the structure of the R-tree using information about the query triangles that will be issued as soon as the structure is filled (as mentioned before, the query triangles are already known when the data arrives).
Reversing the problem also seems to be a valid solution: I would use a 2-dimensional interval tree to get, for each white pixel, a list of all triangles which contain it. The pixel can then be stored in all those result sets in advance and returned instantly when the query arrives. However, I'm not sure how this performs, as the number of triangles is higher than the number of edges but still lower than the number of white pixels (an edge is typically split up into ~20-50 pixels).
A data structure which exploits the fact that white pixels most often have white pixels as neighbors would seem to be the most efficient. However, I have not found anything like that so far.
Upvotes: 3
Views: 2207
Reputation: 2624
There are a couple of computational geometry algorithms that I think would give good results in tandem.
Compute a planar subdivision that contains all of the triangle edges. (This is a little more complicated than computing all intersections of triangle edges.) For each face, make a list of the triangles that contain that face. This is admittedly worst-case cubic, but that's only when the triangles overlap a lot (and I can't help but think that there's a way to compress it to quadratic).
Locate each pixel in the subdivision (i.e., figure out which face it belongs to). The first one in each edge will cost O(log n), but if you have locality thereafter, there may be a way to shortcut the computation to something like O(1) on average. (For example, if you use the trapezoid method and if you store the list of trapezoids that contained the last point, you can traverse up the list until you find a trapezoid that contains the current point and work back down. Compare giving hints to C++ STL set insertion by passing an iterator near the insertion point.)
Upvotes: 0
Reputation: 44240
Decompose the query triangle(s) into n*3 lines. For every point under test you can determine on which side of every line it lies. The rest is boolean logic.
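A rough sketch of what "the rest is boolean logic" could look like for a single triangle follows. The `Line` type and the names `line_through`, `side_mask` and `tri_contains` are invented for this illustration. It assumes counter-clockwise vertex order, at most 32 lines per mask, and that points exactly on a line count as being on the inside.

```c
typedef struct { long long a, b, c; } Line;  /* hypothetical: a*x + b*y + c */

/* Line through (x0,y0) -> (x1,y1). The expression a*x + b*y + c is
   >= 0 for points on the line or to the left of its direction. */
Line line_through(int x0, int y0, int x1, int y1)
{
    Line l;
    l.a = (long long)y0 - y1;
    l.b = (long long)x1 - x0;
    l.c = (long long)x0 * y1 - (long long)x1 * y0;
    return l;
}

/* Bit i of the result is set when (x,y) is on or left of line i.
   Assumes n <= 32. */
unsigned side_mask(const Line *ln, int n, int x, int y)
{
    unsigned m = 0;
    for (int i = 0; i < n; i++)
        if (ln[i].a * x + ln[i].b * y + ln[i].c >= 0)
            m |= 1u << i;
    return m;
}

/* A counter-clockwise triangle contains the point when the bits of
   all three of its lines (given in tri_bits) are set in the mask. */
int tri_contains(unsigned mask, unsigned tri_bits)
{
    return (mask & tri_bits) == tri_bits;
}
```

With several triangles, the mask would be computed once per point and then compared against each triangle's three bits, so lines shared between triangles are only evaluated once.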
EDIT: since your points are rasterised, you could precompute the points on the scanlines where the scanline enters or leaves a particular query triangle (=crosses one of the 3n lines above && is on the "inside" of the other two lines that participate in that particular triangle)
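The scanline idea might be sketched as below, under a few assumptions that are not in the question: the white pixels of each image row are kept as a sorted array of x coordinates, coordinates are integers, and the triangle is not degenerate. `tri_span` and `row_range` are made-up names. `tri_span` computes the inclusive pixel range where a scanline is inside the triangle, and `row_range` picks the stored white pixels in that range with a binary search, so the black pixels are never touched.

```c
#include <stddef.h>

typedef struct { int x, y; } Pt;   /* hypothetical pixel/vertex type */

/* Floor and ceiling of a/b for b > 0 (C division truncates toward 0). */
static long long floor_div(long long a, long long b)
{
    long long q = a / b;
    return (a % b != 0 && a < 0) ? q - 1 : q;
}
static long long ceil_div(long long a, long long b)
{
    long long q = a / b;
    return (a % b != 0 && a > 0) ? q + 1 : q;
}

/* Inclusive integer x-range [*xl, *xr] that triangle (a,b,c) covers on
   scanline y. Returns 0 when no pixel of that scanline is covered. */
int tri_span(Pt a, Pt b, Pt c, int y, int *xl, int *xr)
{
    Pt v[3] = { a, b, c };
    long long lo = 0, hi = 0;
    int found = 0;
    for (int i = 0; i < 3; i++) {
        Pt p = v[i], q = v[(i + 1) % 3];
        if (p.y == q.y) continue;            /* horizontal edge: its endpoints
                                                are covered by the other edges */
        if (p.y > q.y) { Pt t = p; p = q; q = t; }
        if (y < p.y || y > q.y) continue;    /* edge does not reach scanline */
        long long num = (long long)(y - p.y) * (q.x - p.x);
        long long dy = q.y - p.y;            /* > 0 after the swap */
        long long cl = p.x + ceil_div(num, dy);   /* crossing rounded up */
        long long fl = p.x + floor_div(num, dy);  /* crossing rounded down */
        if (!found || cl < lo) lo = cl;
        if (!found || fl > hi) hi = fl;
        found = 1;
    }
    if (!found || lo > hi) return 0;
    *xl = (int)lo;
    *xr = (int)hi;
    return 1;
}

/* xs[0..n) holds the sorted x coordinates of the white pixels in one row.
   Stores the index of the first x >= xl in *first and returns how many
   consecutive entries are <= xr. */
size_t row_range(const int *xs, size_t n, int xl, int xr, size_t *first)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {                        /* lower-bound binary search */
        size_t mid = lo + (hi - lo) / 2;
        if (xs[mid] < xl) lo = mid + 1; else hi = mid;
    }
    *first = lo;
    size_t k = lo;
    while (k < n && xs[k] <= xr) k++;
    return k - lo;
}
```

A query would then loop over the rows between the triangle's lowest and highest vertex, call `tri_span` for each row (or look up a precomputed span, since the triangles are known in advance), and hand the span to `row_range`.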
UPDATE: Triggered by another topic ( How can I find out if point is within a triangle in 3D? ) I'll add code to prove that a non-convex case can be expressed in terms of "which side of every line a point is on". Since I am lazy, I'll use an L-shaped form. IMHO other non-convex shapes can be processed similarly. The lines are parallel to the X- and Y-axes, but that again is laziness.
/*
Y
|    +-+
|    | |
|    | +-+
|    |   |
|    +---+
|
0------ X
the line pieces:
Horizontal:
    (x0,y0) - (x2,y0)
    (x1,y1) - (x2,y1)
    (x0,y2) - (x1,y2)
Vertical:
    (x0,y0) - (x0,y2)
    (x1,y1) - (x1,y2)
    (x2,y0) - (x2,y1)
The lines:
    (x==x0)
    (x==x1)
    (x==x2)
    (y==y0)
    (y==y1)
    (y==y2)
Combine them:
**/
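#define x0 2
#define x1 4
#define x2 6
#define y0 2
#define y1 4
#define y2 6
#include <stdio.h>

/* Returns 1 when (x,y) lies inside the L-shape, 0 otherwise. */
int inside(int x, int y)
{
    /* One bit per line: the bit is set when the point is on or beyond that line. */
    switch( (x<x0 ?0:1)
           +(x<x1 ?0:2)
           +(x<x2 ?0:4)
           +(y<y0 ?0:8)
           +(y<y1 ?0:16)
           +(y<y2 ?0:32) ) {
    case 1+8:       /* x0 <= x < x1, y0 <= y < y1: lower left cell */
    case 1+2+8:     /* x1 <= x < x2, y0 <= y < y1: lower right cell */
    case 1+8+16:    /* x0 <= x < x1, y1 <= y < y2: upper left cell */
        return 1;
    default:
        return 0;
    }
}

int main(void)
{
    int xx,yy,res;
    /* Stop at EOF or on malformed input instead of looping forever. */
    while (scanf("%d %d", &xx, &yy) == 2) {
        res = inside(xx, yy);
        printf("(%d,%d) := %d\n", xx, yy,res);
    }
    return 0;
}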
Upvotes: 1