Reputation: 45
I have a hilbert curve index based on this algorithm. I take two to four values (latitude, longitude, time in unix format and an id code) and create a 1-d hilbert curve.
I'm looking for a way to use this data to create a bounding box query (i.e. "find all ids within this rectangle).
I'm looking for a way to do so without decoding the 1d Hilbert code back into its constituent parts. It seems to be easier to do this with a Morton/Z-order curve but I was wondering about the locality preservation.
My question is: if I created a 2d hilbert curve range (i.e. I converted the range of the box into a hilbert curve so x1y1-> hilbert value1 and x2y2-> hilbertvalue2) would all values of corresponding 2d hilbert values fall within their range?
E.g. If I converted (1,2) and (20,30) into Hilbert values and then searched for all values between hilbertvalue1 and hilbertvalue2, would all the values I get decode to fall within (1,2) and (20, 30), or would I have to perform additional transformations?
An additional problem is crafting a range when you have more than 2 dimensions. I have the ability to convert in and out of Hilbert curves but how can I make sure that even 4d values have latitude and longitude that falls within the same rectangle/bounding box?
Thanks.
Upvotes: 1
Views: 1528
Reputation: 12097
My question is: if I created a 2d hilbert curve range (i.e. I converted the range of the box into a hilbert curve so x1y1-> hilbert value1 and x2y2-> hilbertvalue2) would all values of corresponding 2d hilbert values fall within their range?
The answer is no. This is part of the challenge of using a Hilbert index. Below is an example curve. You'll notice that you can have a point on the curve that has a higher index than the vertices of a box containing that point. The light blue box is an example where the vertices have indices 117, 122, 133, 138 yet inside (though on the border) is the value 143.
One simple approach is brute force where you visit every cell in the search region and calculate the index in those cells. Then you compile a list of index ranges that would be used in a query. You might join some ranges and filter later as a performance optimization based on benchmarks (lots of small range queries might take longer than querying fewer larger ranges followed by a filter). I'd like to see something more elegant than this but have yet to see it.
UPDATE: I've worked out something more elegant than the brute force technique and the details (and a java library) are at https://github.com/davidmoten/hilbert-curve. In short, the endpoints of the ranges that exactly cover a search box will all be on the perimeter of the region. If you sort all the hilbert curve values on the perimeter of the region and start with the smallest value you can then pair up all the ranges by doing tests on whether the next point on the curve stays on the perimeter, leaves the box or is inside the box.
An additional problem is crafting a range when you have more than 2 dimensions. I have the ability to convert in and out of Hilbert curves but how can I make sure that even 4d values have latitude and longitude that falls within the same rectangle/bounding box?
The perimeter technique described above works for any number of dimensions (but of course becomes more expensive!).
Upvotes: 3
Reputation: 12592
For 2 dimensions you can treat the curve as a base-4 number (quadkey) and search from left to right.
Upvotes: 0