Reputation: 132
I am using ELKI for clustering and I tried it more than 1k times on many datasets and it was fine :D but when i run it on one of my files (it was the big one) I see an error in initializing tree. whats the problem? how can I fix it?
the whole command and result is here:
java -jar elki-bundle-0.7.1.jar KDDCLIApplication -verbose -verbose -enableDebug true -dbc.in my_input -parser.labelIndices 0 -db.index tree.spatial.rstarvariants.rstar.RStarTreeFactory -time -algorithm clustering.DBSCAN -algorithm.distancefunction geo.LngLatDistanceFunction -geo.model SphericalHaversineEarthModel -dbscan.epsilon 50.0 -dbscan.minpts 446 -resulthandler ResultWriter,ExportVisualizations -out my_output -vis.output my_visOutput
de.lmu.ifi.dbs.elki.datasource.FileBasedDatabaseConnection.load: 5716 ms de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar.RStarTreeIndex.directory.capacity: 95 de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar.RStarTreeIndex.directory.minfill: 38 de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar.RStarTreeIndex.leaf.capacity: 153 de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar.RStarTreeIndex.leaf.minfill: 61 Node is not a directory node! java.lang.UnsupportedOperationException: Node is not a directory node! at de.lmu.ifi.dbs.elki.index.tree.AbstractNode.addDirectoryEntry(AbstractNode.java:240) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.insertDirectoryEntry(AbstractRStarTree.java:194) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.reInsert(AbstractRStarTree.java:655) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.overflow.LimitedReinsertOverflowTreatment.handleOverflow(LimitedReinsertOverflowTreatment.java:97) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.overflowTreatment(AbstractRStarTree.java:571) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.adjustTree(AbstractRStarTree.java:676) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.adjustTree(AbstractRStarTree.java:705) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.insertLeafEntry(AbstractRStarTree.java:175) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.reInsert(AbstractRStarTree.java:649) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.overflow.LimitedReinsertOverflowTreatment.handleOverflow(LimitedReinsertOverflowTreatment.java:97) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.overflowTreatment(AbstractRStarTree.java:571) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.adjustTree(AbstractRStarTree.java:676) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.insertLeafEntry(AbstractRStarTree.java:175) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.AbstractRStarTree.insertLeaf(AbstractRStarTree.java:151) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar.RStarTreeIndex.insert(RStarTreeIndex.java:104) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar.RStarTreeIndex.insertAll(RStarTreeIndex.java:129) at de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar.RStarTreeIndex.initialize(RStarTreeIndex.java:94) at de.lmu.ifi.dbs.elki.database.StaticArrayDatabase.initialize(StaticArrayDatabase.java:168) at de.lmu.ifi.dbs.elki.workflow.InputStep.getDatabase(InputStep.java:63) at de.lmu.ifi.dbs.elki.KDDTask.run(KDDTask.java:108) at de.lmu.ifi.dbs.elki.application.KDDCLIApplication.run(KDDCLIApplication.java:61) at de.lmu.ifi.dbs.elki.application.AbstractApplication.runCLIApplication(AbstractApplication.java:194) at de.lmu.ifi.dbs.elki.application.KDDCLIApplication.main(KDDCLIApplication.java:96) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at de.lmu.ifi.dbs.elki.application.ELKILauncher.main(ELKILauncher.java:60)
Upvotes: 0
Views: 46
Reputation: 8715
Without the data it is hardly reproducable, unfortunately.
Apparently there is some inconsistency in the tree, where the tree is no longer correctly balanced. This will require quite some effort to debug, so it would of course be very nice if you could find this bug. From a quick look at the stack trace it seems that a reinsert caused a second reinsert (IIRC, R*-trees allows multiple reinserts at the same time, but only at different levels; this may indeed be a different level). But the reinserts are really tricky to get right. In fact, I have a faster and cleaner implementation of the R-trees for ELKI somewhere, but I never got to do a clean reinsert; so they don't have all the current functionality (and that is why the rewrite never got merged into ELKI).
As a simple work-around (and that is recommended anyway), bulk-load your tree with -spatial.bulkstrategy SortTileRecursiveBulkSplit
instead. Then this code path will never be used (plus, the tree will be better anyway, this will be significantly faster, as the "global" bulk load produces non-overlapping leaf pages, and low-overlap trees with optimum fill).
This is recommended in the indexing documentation, because of the better performance.
Upvotes: 1