Tioma

Reputation: 2150

SolrCloud: workaround for classic pagination with "start,rows" parameters

I have a SolrCloud cluster with 3 shards.

My goal: select and process all products in a category.

Current implementation: fetch the results in chunks in a loop.

...

But as "start" grows, performance degrades. The explanation is here: https://wiki.apache.org/solr/DistributedSearch

Makes it more inefficient to use a high "start" parameter. For example, if you request start=500000&rows=25 on an index with 500,000+ docs per shard, this will currently result in 500,000 records getting sent over the network from the shard to the coordinating Solr instance. If you had a single-shard index, in contrast, only 25 records would ever get sent over the network. (Granted, setting start this high is not something many people need to do.)

Any ideas on how I can iterate over all records in a category?

Upvotes: 0

Views: 968

Answers (2)

MatsLindh

Reputation: 52802

There is another, more efficient way to paginate in Solr: cursors, which use the current position in the sort order instead of an offset. This is particularly useful for deep pagination.

See the section about cursors on the Pagination of Results wiki page. This should speed up retrieval, since each server can sort its local documents, determine where the cursor falls in that sequence, and return the 25 documents that follow.

UPDATE: Another useful link: coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets
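
To make the cursor loop concrete, here is a minimal Python sketch, not a definitive implementation. It relies on the documented cursorMark behaviour: send cursorMark=* on the first request, include the uniqueKey field in the sort, leave "start" unset, and stop once nextCursorMark comes back unchanged. The field names "category" and "id" and the helper names solr_select and iter_category are assumptions for illustration only; substitute your own schema and client.

```python
import json
import urllib.parse
import urllib.request


def solr_select(base_url, params):
    """Send one /select request and return the parsed JSON response."""
    query = urllib.parse.urlencode(params)
    with urllib.request.urlopen(f"{base_url}/select?{query}") as resp:
        return json.load(resp)


def iter_category(fetch, category, rows=25):
    """Yield every document in a category using cursorMark paging.

    `fetch` is any callable that takes a dict of Solr parameters and
    returns the parsed JSON response (for example a wrapper around
    solr_select above).
    """
    cursor = "*"  # "*" asks Solr to start from the beginning of the sort
    while True:
        response = fetch({
            "q": "*:*",
            "fq": f"category:{category}",  # assumed field name
            "sort": "id asc",              # assumed uniqueKey; required as tie-breaker
            "rows": rows,
            "cursorMark": cursor,
            "wt": "json",
        })
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # unchanged cursor: nothing left to fetch
            break
        cursor = next_cursor
```

Usage would look something like `iter_category(functools.partial(solr_select, "http://localhost:8983/solr/products"), "shoes")`, where the URL and collection name are hypothetical. The request size stays at "rows" documents per shard no matter how deep into the result set you are.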

Upvotes: 3

orangepips

Reputation: 9961

I think the short answer is "no" - it's a limitation of how Solr does sharding. Instead, can you amass a list of document unique keys outside of Solr - presumably from a backing database - and then retrieve from the index using sets of those keys instead?

e.g. ID:(1 OR 2 OR 3 OR ...very long list...)

Or, if the unique keys are numeric you could use a moving range instead:

ID:[1 TO 1000] then ID:[1001 TO 2000] and so forth.

With either option you'd also restrict by category. Both should avoid the slowdown associated with windowing.
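
As a rough sketch of how those query strings could be generated in Python: the helper names id_range_queries and id_list_queries, the step of 1000 and the batch size of 500 are assumptions for illustration, and the field name ID is taken from the examples above. Keeping the OR batches small is deliberate, since Solr limits the number of boolean clauses per query (maxBooleanClauses, 1024 by default as far as I know).

```python
def id_range_queries(min_id, max_id, step=1000):
    """Yield inclusive ID range clauses that cover min_id..max_id."""
    low = min_id
    while low <= max_id:
        high = min(low + step - 1, max_id)
        yield f"ID:[{low} TO {high}]"
        low = high + 1


def id_list_queries(ids, batch_size=500):
    """Yield ID:(a OR b OR ...) clauses, batch_size keys at a time."""
    ids = list(ids)
    for i in range(0, len(ids), batch_size):
        batch = ids[i:i + batch_size]
        yield "ID:(" + " OR ".join(str(key) for key in batch) + ")"
```

Each generated clause would be sent as the query (or a filter query) alongside the category restriction, with "start" left at 0.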

Upvotes: 0
