user1416312

Reputation: 3

How can I get the size of Solr Facet results?

There is a multi-valued field in my schema named XXX, and there may be more than 100,000 documents in my Solr index. I want to find out how many distinct values exist in XXX.

For now, I use facet.field=XXX&facet.limit=-1 and count the facet results. This takes a long time and sometimes fails with a read timeout.

All I want from the facet results is the size; I don't care about the contents.

By the way, I use Solr 5.0. Is there a better way to do this?

Upvotes: 0

Views: 553

Answers (1)

Fuu

Reputation: 3474

The index maintains a list of unique terms, since that is how an inverted index works. That list is also very fast to compute and return, unlike faceting. If your values are single terms, this could be a way to get what you want. You can retrieve the unique terms, provided the TermsComponent is enabled in your solrconfig.xml. For example:

http://localhost:8983/solr/corename/terms?q=*%3A*&wt=json&indent=true&terms=true&terms.fl=XXX

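This would return a list of the unique terms and their counts (by default Solr returns only the first 10 terms; add terms.limit=-1 to the request to return all of them):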

{
  "responseHeader":{
  "status":0,
  "QTime":0},
  "terms":{
    "XXX":[
    "John Backus",3,
    "Ada Lovelace",3,
    "Charles Babbage",2,
    "John Mauchly",1,
    "Alan Turing",1
    ]
  }
}

The number of entries in this list is the number of unique terms; in the example that is 5. Unfortunately, the API doesn't provide a way to ask for just the count without returning the list of terms. So while it is faster at generating the list, the time required to transfer the full list gives it a drawback similar to the faceting approach. The returned list may also become quite long.
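As a rough sketch of the counting step, something like the following Python could be used on the client side. The helper names `build_terms_url` and `count_unique_terms` are made up for illustration, and the code assumes the default JSON layout shown above, where each field maps to a flat list alternating term and count. The `terms.limit=-1` parameter is added so that the list is not cut off at the default of 10 entries.

```python
from urllib.parse import urlencode


def build_terms_url(base_url, field):
    """Build a TermsComponent request URL (hypothetical helper).

    base_url is assumed to look like http://localhost:8983/solr/corename
    and the /terms handler is assumed to be enabled in solrconfig.xml.
    """
    params = {
        "wt": "json",
        "terms": "true",
        "terms.fl": field,
        "terms.limit": -1,  # -1 removes the default limit of 10 terms
    }
    return f"{base_url}/terms?{urlencode(params)}"


def count_unique_terms(response, field):
    """Count the unique terms in an already-parsed /terms JSON response.

    Assumes the default flat layout [term, count, term, count, ...],
    so the number of terms is half the list length.
    """
    flat = response["terms"][field]
    return len(flat) // 2


# The example response from above, already parsed into a dict.
sample = {
    "responseHeader": {"status": 0, "QTime": 0},
    "terms": {
        "XXX": [
            "John Backus", 3,
            "Ada Lovelace", 3,
            "Charles Babbage", 2,
            "John Mauchly", 1,
            "Alan Turing", 1,
        ]
    },
}

print(build_terms_url("http://localhost:8983/solr/corename", "XXX"))
print(count_unique_terms(sample, "XXX"))
```

Note that this only moves the counting to the client. The full term list still has to be transferred, so the drawback described above remains.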

Check out https://wiki.apache.org/solr/TermsComponent for the API details.

Upvotes: 0
