Kevin

Reputation: 25269

Solr spatial bad performance

I'm using Solr 3.4, spatial filtering with the schema having LatLonType (subType=tdouble). I have an index of about 20M places. My basic problem is that if I do a bbox filter with cache=true, the performance is reasonably good (~40-50 QPS, about 100-150 ms latency), but a big downside is very rapid old-gen heap growth, ultimately leading to major collections every 30-40 minutes (on a very large heap, 25GB). At that point performance is beyond unacceptable. On the other hand I can turn off caching for bbox filters, but then my QPS drops and latency rises from ~100 ms to ~500 ms. The NumericRangeQuery javadoc talks about the great performance you can get (sub-100 ms), but now I wonder if that was measured with the filterCache enabled, and nobody bothered to look at the heap growth that results. I feel like this is sort of a catch-22 since neither configuration is really acceptable.
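For reference, the filter in question looks roughly like this (the field name loc is a placeholder; the cache local param is available as of Solr 3.4):

    fq={!bbox sfield=loc pt=45.15,-93.85 d=5}
    fq={!bbox sfield=loc pt=45.15,-93.85 d=5 cache=false}

The first form goes through the filterCache; the second bypasses it.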

I'm open to any ideas. My last idea (untried) is to use geohash (and pray that it either performs better with cache=false, or has more manageable heap growth with cache=true).
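For reference, a minimal sketch of what the geohash variant would look like in schema.xml, using the built-in GeoHashField (field names are placeholders):

    <fieldType name="geohash" class="solr.GeoHashField"/>
    <field name="loc_hash" type="geohash" indexed="true" stored="false"/>

with the same kind of geofilt/bbox filter queries pointed at loc_hash.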

EDIT:

Precision step: default (8 for double I think)

System memory: 32GB (EC2 M2 2XL)

JVM: 24GB

Index size: 11 GB

EDIT2:

A tdouble with a precisionStep of 8 means that your doubles will be split into sequences of 8 bits. If all your latitudes and longitudes only differ by the last sequence of 8 bits, then tdouble would have the same performance as a normal double on a range query. This is why I suggested testing a precisionStep of 4.

Question: what does this actually mean for a double value?
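To illustrate (a rough sketch using Lucene 3.x's NumericUtils; the coordinate values are made up): Lucene indexes a tdouble as a sortable 64-bit long, plus one lower-precision term per precisionStep-bit truncation, and a range query only benefits from a truncation level if the two range endpoints actually differ at that level:

    import org.apache.lucene.util.NumericUtils;

    public class PrecisionStepDemo {
        public static void main(String[] args) {
            // Two nearby latitudes (hypothetical values)
            long a = NumericUtils.doubleToSortableLong(45.150000);
            long b = NumericUtils.doubleToSortableLong(45.150001);
            // With precisionStep=8, Lucene also indexes the sortable long
            // truncated by 8, 16, 24, ... bits; a level only helps a range
            // query where the two bounds' prefixes differ.
            for (int shift = 8; shift < 64; shift += 8) {
                System.out.printf("shift %2d: prefixes %s%n", shift,
                        (a >>> shift) == (b >>> shift) ? "equal" : "differ");
            }
        }
    }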

Upvotes: 4

Views: 1461

Answers (1)

jpountz

Reputation: 9964

A profile of Solr while it is responding to your spatial queries would be of great help in understanding what is slow; see hprof, for example.
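For example, the built-in hprof agent can sample CPU usage with something like this (an illustrative invocation; tune interval and depth as needed):

    java -agentlib:hprof=cpu=samples,interval=20,depth=8 -jar start.jar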

Still, here are a few ideas on how you could (perhaps) improve latency.

First you could test what happens when decreasing the precisionStep (try 4, for example). If the latitudes and longitudes are too close to each other and the precisionStep is too high, Lucene cannot take advantage of having several indexed values.
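For example, a sketch of the relevant schema.xml pieces (type names are placeholders, and changing precisionStep requires a full reindex):

    <fieldType name="tdouble4" class="solr.TrieDoubleField" precisionStep="4"
               omitNorms="true" positionIncrementGap="0"/>
    <fieldType name="location" class="solr.LatLonType" subFieldType="tdouble4"/>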

You could also try giving a little less memory to the JVM in order to give the OS cache a better chance to cache frequently accessed index files.
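For example (sizes are illustrative): on the 32 GB machine above, shrinking the heap from -Xmx24g to -Xmx8g would leave roughly 20 GB for the OS page cache, enough to hold the entire 11 GB index in memory.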

Then, if it is still not fast enough, you could try replacing TrieDoubleField as the sub field type with a field type that uses a frange query for its getRangeQuery method. This would reduce the number of disk accesses while computing the range, at the cost of higher memory usage. (I have never tested it; it might provide horrible performance as well.)
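A rough sketch of that idea against the Solr 3.4 API (untested, as noted above; the class name is a placeholder and package names are from Solr 3.x):

    import org.apache.lucene.search.Query;
    import org.apache.solr.schema.SchemaField;
    import org.apache.solr.schema.TrieDoubleField;
    import org.apache.solr.search.QParser;
    import org.apache.solr.search.SolrConstantScoreQuery;
    import org.apache.solr.search.function.ValueSourceRangeFilter;

    public class FunctionRangeDoubleField extends TrieDoubleField {
        @Override
        public Query getRangeQuery(QParser parser, SchemaField field,
                                   String min, String max,
                                   boolean minInclusive, boolean maxInclusive) {
            // Check the range against FieldCache values instead of walking
            // the terms dictionary: fewer disk accesses, more heap.
            return new SolrConstantScoreQuery(new ValueSourceRangeFilter(
                    getValueSource(field, parser),
                    min, max, minInclusive, maxInclusive));
        }
    }

It would then be registered as a fieldType in schema.xml and wired in through the LatLonType sub field, e.g. subFieldType="frange_double".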

Upvotes: 1
