Reputation: 7415
I have a range of segments that intersect each other at most at their ends. I want to merge these segments into polylines.
Is there an algorithm that does this in O(N_segments)
without using extra storage (e.g. without having to build a tree or some other spatial data-structure for the segment points and working on that)?
The number of segments I have is small, O(10). So putting them into a dynamic data structure like a hash table or a map (red-black tree) would probably be more expensive than the O(N^2)
loop at the end, unless I put that data structure on the stack and avoid any memory allocations. (The polylines I am using are implemented with a small_vector,
which avoids allocations as long as the number of points is small enough.)
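For concreteness, a minimal sketch of such a stack-friendly polyline type (assuming e.g. Boost's small_vector; the Point type and the inline capacity of 8 are arbitrary illustrations):

#include <boost/container/small_vector.hpp>

struct Point { double x, y; };

// Up to 8 points are stored inline on the stack; only longer
// polylines fall back to a heap allocation.
using Polyline = boost::container::small_vector<Point, 8>;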
Currently I've come up with this:
polylines = []
// 1. Append each segment to the range of polylines, merging when possible:
for each segment in segments:
    for each polyline in polylines:
        merge_result = merge(segment, polyline)
        if not merge_result: continue
        polyline = merge_result
        goto done
    // either polylines is empty or no merge was possible
    polylines.append(segment)
    done: continue
// 2. Try to merge the polylines among themselves until no more merges are possible
// Pure brute force, quadratic
done = false
while not done:
    done = true
    for p1 in polylines:
        for p2 in polylines[p1+1..end]:
            merge_result = merge(p1, p2)
            if not merge_result: continue
            p1 = merge_result
            polylines.remove(p2)
            done = false
            goto restart
    restart: continue
But the second loop is clearly quadratic, so I wonder if there is a better algorithm to merge/join/combine a sequence of segments among themselves.
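For reference, here is the same two-phase scheme spelled out as a C++ sketch (Point, Polyline and merge are illustrative names; endpoints are matched by exact equality):

#include <boost/container/small_vector.hpp>
#include <cstddef>
#include <vector>

struct Point {
    double x, y;
    bool operator==(const Point& p) const { return x == p.x && y == p.y; }
};

using Polyline = boost::container::small_vector<Point, 8>;

// Try to splice b onto either end of a; on success, a is extended in place.
bool merge(Polyline& a, const Polyline& b) {
    if (a.back() == b.front())
        a.insert(a.end(), b.begin() + 1, b.end());       // a ... -> b ...
    else if (a.back() == b.back())
        a.insert(a.end(), b.rbegin() + 1, b.rend());     // a ... <- b reversed
    else if (a.front() == b.back())
        a.insert(a.begin(), b.begin(), b.end() - 1);     // b ... -> a ...
    else if (a.front() == b.front())
        a.insert(a.begin(), b.rbegin(), b.rend() - 1);   // b reversed -> a ...
    else
        return false;
    return true;
}

std::vector<Polyline> merge_segments(const std::vector<Polyline>& segments) {
    std::vector<Polyline> polylines;
    // 1. Fold each segment into the first polyline it can extend.
    for (const Polyline& s : segments) {
        bool merged = false;
        for (Polyline& p : polylines)
            if ((merged = merge(p, s))) break;
        if (!merged) polylines.push_back(s);
    }
    // 2. Brute-force pairwise merging until a fixed point is reached.
    for (bool again = true; again; ) {
        again = false;
        for (std::size_t i = 0; i < polylines.size() && !again; ++i)
            for (std::size_t j = i + 1; j < polylines.size(); ++j)
                if (merge(polylines[i], polylines[j])) {
                    polylines.erase(polylines.begin() + j);
                    again = true;
                    break;
                }
    }
    return polylines;
}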
Upvotes: 1
Views: 1560
Reputation: 1435
Your problem is equivalent to finding duplicates in an array. This problem can't be solved in O(N) time and O(1) space in the general case. You can either use sorting, as suggested, to get O(N log N) complexity, or use a data structure. For finding duplicates in the general case you can look at this discussion. For a special case, when an array of size n contains elements in the range 0, ..., n-1, there is an O(N) time and O(1) space solution which makes use of the fact that elements can be used as indices, see this post.
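For illustration, a minimal sketch of that special-case trick, assuming every value lies in [0, n-1] (the array is modified in place; a value occurring k times is reported k-1 times):

#include <iostream>
#include <vector>

// Values in [0, n-1] double as indices: the first visit to value v adds
// n to a[v]; any later visit finds a[v] >= n and reports a duplicate.
void print_duplicates(std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    for (int i = 0; i < n; ++i) {
        int v = a[i] % n;              // recover the original value
        if (a[v] >= n) std::cout << "duplicate: " << v << '\n';
        a[v] += n;
    }
}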
However, if you are anyway talking about only 10 elements, even a quadratic loop won't hurt much. If time is really crucial, you should in any case benchmark both methods rather than guessing. The problem is that nobody can tell you which one will be faster on your machine for just 10 elements, since complexity classes become meaningful only for large N. For small N, an O(N^2) algorithm can be much faster than an O(N log N) algorithm. Also, apart from memory allocation, cache efficiency and whatever else come into play. So, my suggestion: either benchmark if you really care about speed, or don't bother if you don't.
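If you do benchmark, timing many repetitions with std::chrono is enough at this scale; a crude sketch (make sure the result of each call is actually consumed, or the compiler may optimize the loop away):

#include <chrono>

// Average time per call of f, in nanoseconds, over 'reps' repetitions;
// a single run on ~10 segments is far below the clock's resolution.
template <class F>
double avg_ns(F&& f, int reps = 1000000) {
    auto t0 = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(t1 - t0).count() / reps;
}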
Upvotes: 0
Reputation: 2623
I seriously doubt that an O(n) method can exist.
Here is an O(n log n) method that detects the segment extremities that have exactly the same coordinates. It uses a "data structure", but it is a very simple one (just a vector):
1) create a vector of elements (x, y, i) for all extremities of all segments, where x, y denote the coordinates of the extremity, and i the index of the extremity (for instance, 2*segment index and 2*segment index + 1 for the two extremities of a segment)
2) sort the vector with the lexicographic order on (x, y)
3) scan the vector; the points with exactly the same coordinates are contiguous in the vector (and with the index i you can retrieve the segment extremities they correspond to). A minimal sketch of these steps is given below.
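A possible rendering in C++ (the Segment type and its field names are assumptions, not part of the method):

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <tuple>
#include <vector>

struct Segment { double ax, ay, bx, by; };   // endpoints a and b (illustrative)
struct Extremity { double x, y; int i; };    // i = 2*segment index + 0 or 1

void report_shared_points(const std::vector<Segment>& segs) {
    // 1) collect all extremities
    std::vector<Extremity> v;
    v.reserve(2 * segs.size());
    for (int s = 0; s < static_cast<int>(segs.size()); ++s) {
        v.push_back({segs[s].ax, segs[s].ay, 2 * s});
        v.push_back({segs[s].bx, segs[s].by, 2 * s + 1});
    }
    // 2) sort lexicographically on (x, y)
    std::sort(v.begin(), v.end(), [](const Extremity& p, const Extremity& q) {
        return std::tie(p.x, p.y) < std::tie(q.x, q.y);
    });
    // 3) identical points are now adjacent in the vector
    for (std::size_t k = 0; k + 1 < v.size(); ++k)
        if (v[k].x == v[k + 1].x && v[k].y == v[k + 1].y)
            std::cout << "extremities " << v[k].i << " and " << v[k + 1].i
                      << " coincide\n";
}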
I am using this algorithm for merging vertices in 3D meshes; it is simple and very fast, much faster than using a hash map or a set (it detects duplicates in point sets as large as 10 million points within seconds).
Upvotes: 2