Frank

Reputation: 530

Solr/Solarium unique field value count

I'm using the Solarium PHP library to connect to a Solr instance. I have an index with around 3.5 million documents. Searching and filtering work great, but there is one thing I can't get to perform well in Solr.

The documents describe companies. I want to know how many unique phone numbers are in the index for a given query. Some related companies share a phone number, and some have no phone number at all.

Facets are not really an option since they are limited to 100 results per request. For 3.5 million documents that would mean a lot of requests. I tried the getStats() option, but that was slow too. I finally resorted to GroupComponent queries, which seem to do the job.

Still, if the result set is large (100k+ results), the query runs for a very long time and eventually crashes Solr. I increased the memory limits to prevent the crashes, but it still doesn't finish within a reasonable time. This is my code:

 $groupComponent = $select->getGrouping();
 $groupComponent->addField('phone');
 $groupComponent->setNumberOfGroups(true);
 $groupComponent->setLimit(0);
 $groupComponent->setTruncate(true);
 $groupComponent->setFormat('simple');
 $groupComponent->setFacet(true);

 $resultset = $this->client->execute($select);
 $groups = $resultset->getGrouping();

I actually only need the counts, not the results. I set the limit to 0, but I'm not sure whether that means zero or unlimited in this case. Setting it to 1 doesn't make any difference, so I'm not sure it is possible to get just the counts. I have also tried adding $groupComponent->setMainresult(true); but that doesn't make it faster and always seems to return 0 for the number of phone numbers.
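For anyone who wants to reproduce this against Solr directly, the grouping settings above should, as far as I can tell, correspond roughly to the request parameters below. This is only a sketch for comparison: the `*:*` query and the variable names are placeholders, and the mapping from Solarium setters to `group.*` parameters is my assumption rather than something taken from the actual request.

```php
<?php
// Sketch only: Solr request parameters that the Solarium grouping settings
// in the question are assumed to translate to. The query '*:*' and the
// variable names are placeholders, not part of the original code.
$params = [
    'q'              => '*:*',    // placeholder for the real query
    'group'          => 'true',   // grouping enabled
    'group.field'    => 'phone',  // addField('phone')
    'group.ngroups'  => 'true',   // setNumberOfGroups(true)
    'group.limit'    => 0,        // setLimit(0)
    'group.truncate' => 'true',   // setTruncate(true)
    'group.format'   => 'simple', // setFormat('simple')
    'group.facet'    => 'true',   // setFacet(true)
];

// Build the query string that would be appended to the select handler URL.
$queryString = http_build_query($params);
echo $queryString, PHP_EOL;
```

Sending the resulting query string to the select handler by hand makes it possible to check whether the slowness comes from Solr itself or from the Solarium side.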

If anybody has a suggestion for speeding this up in Solarium or directly in Solr, I'd love to hear it. Thanks!

Upvotes: 0

Views: 131

Answers (0)
