TilmannZ
TilmannZ

Reputation: 1889

Index implementation

I'm trying to implement rectangular range queries for my PH-tree index, I have the following questions:

  1. Looking at existing implementations, all stored data appears to be point-data (as opposed to rectangles/cuboids/... defined by a lower left or upper right corner). Is that true? Or how could I tell from the Relation whether I'm storing points and rectangles and from where could I get the upper-left corner of the rectangle?
  2. Is there a query type that simply returns all points that lie in a rectangle (or returns all rectangles that intersect with a given query rectangle)? I looked at RangeQuery, but from the documentation it appears to return the nearest neighbours of a given range. Similarily, the other implementations of DatabaseQuery don't seem to support this standard query.
  3. Is there a way I can get the existing tests to verify my implementation? Is it sufficient to implement an IndexFactory with the @apiviz annotations?
  4. Maybe a bit off topic: I couldn't find an ELKI mailing list. The website mentions a "user mailing list" for updates and news, but registration from outside LMU is blocked. The site also mentions a community mailing list, but I couldn't find a link, could someone post it here?

Upvotes: 1

Views: 44

Answers (1)

Erich Schubert
Erich Schubert

Reputation: 8715

  1. Relations in ELKI have type information.

    If the type is a NumberVector, then it is point data. We don't have many use cases yet for storing rectangles, but you could write your index so it can work with e.g. SpatialComparable (which essentially is any kind of bounding box).

  2. There is currently no query type for rectangle window queries yet, but these can be emulated using the center and a weighted Maximum norm. There may be only one or two data mining algorithms in ELKI that use rectangular queries. Most data mining algorithms that can be accelerated using indexes appear to either use radius, or kNN search.

  3. The best way for testing is to support standard range and knn queries, and then run e.g. DBSCAN clustering and LOF outlier detection. If you implement a IndexFactory and a Parameterizer for it (so it can be configured in the MiniGUI) then this should be easy to test.

    @apiviz annotations are solely used for JavaDoc, and we have been considering to move to a different tool for UML diagrams.

  4. Sorry, the mailing lists registration pages are currently not accessible; the system admins appear to have an unresolved security concern. I have updated the web page with instructions how to subscribe by email.

Upvotes: 1

Related Questions