dotancohen

Reputation: 31471

Which words appear most commonly in an indexed field?

How could I query Solr for the most common indexed words? For example, given these fields for each document:

I would like Solr to return to me, in any format, the following output:

Thanks.

Upvotes: 5

Views: 4107

Answers (3)

dotancohen

Reputation: 31471

The Terms Component seems well suited to the task. The article "Self Updating Solr Stopwords" uses the Terms Component to find the 1000 most common indexed words and add them to the stopwords file.

Finding the 1000 most common indexed keywords (sorted by frequency, descending):

http://url.to.solr/solr/terms?terms.fl=MY_FIELD&terms.limit=1000
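For anyone scripting this, here is a minimal Python sketch of building that request and reading the result. It assumes the `/terms` handler is enabled, that JSON output is requested with `wt=json`, and that the response uses Solr's default flat layout (`[term, count, term, count, ...]`). The helper names `terms_url` and `parse_terms` and the sample data are made up for illustration.

```python
from urllib.parse import urlencode

def terms_url(base, field, limit=1000):
    """Build a Terms Component URL for the most frequent terms in `field`."""
    params = {
        "terms.fl": field,      # field to read terms from
        "terms.limit": limit,   # how many terms to return
        "terms.sort": "count",  # most frequent first (the default, stated explicitly)
        "wt": "json",
    }
    return f"{base}/terms?{urlencode(params)}"

def parse_terms(response, field):
    """Turn the assumed flat [term, count, ...] list into (term, count) pairs."""
    flat = response["terms"][field]
    return list(zip(flat[0::2], flat[1::2]))

url = terms_url("http://url.to.solr/solr", "MY_FIELD")

# Made-up response body in the assumed layout, standing in for a real Solr reply.
sample = {"terms": {"MY_FIELD": ["solr", 42, "index", 17, "query", 9]}}
top_terms = parse_terms(sample, "MY_FIELD")
```

In real use you would fetch `url` with any HTTP client and pass the decoded JSON to `parse_terms`.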

Upvotes: 5

d whelan

Reputation: 804

Use the Luke request handler:

http://wiki.apache.org/solr/LukeRequestHandler

Example:

http://localhost:8983/solr/admin/luke?fl=Your_Indexed_Field&numTerms=500
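A rough Python sketch of pulling the top terms out of a Luke response, in case it helps. It assumes the request was made with `wt=json` and that the top terms arrive under `fields.<field>.topTerms` as a flat `[term, count, ...]` list; check this against what your Solr version actually returns. The function name `top_terms_from_luke` and the sample data are hypothetical.

```python
def top_terms_from_luke(response, field):
    """Return {term: count} from the assumed fields.<field>.topTerms layout."""
    info = response.get("fields", {}).get(field, {})
    flat = info.get("topTerms", [])  # assumed flat list: term, count, term, count, ...
    return dict(zip(flat[0::2], flat[1::2]))

# Made-up response body standing in for a real Luke reply.
sample = {
    "fields": {
        "Your_Indexed_Field": {
            "type": "text_general",
            "topTerms": ["solr", 42, "index", 17],
        }
    }
}
counts = top_terms_from_luke(sample, "Your_Indexed_Field")
missing = top_terms_from_luke(sample, "No_Such_Field")
```

A field that is absent from the response simply yields an empty dict, so the caller does not need to special-case it.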

Upvotes: 8

Ansari

Reputation: 8218

As far as I know this isn't exactly Solr's intended use case, but it can be done with faceting. No guarantees about performance, though. Make sure your field is tokenized properly, then run a query as usual with the following additional parameters at the end:

&facet=true&facet.field=yourfield

Replace yourfield with the name of the field you have your data stored in.
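Here is a small Python sketch of the faceting approach, under a few assumptions: the query goes to a standard `/select` handler on a hypothetical `localhost` core, `q=*:*` with `rows=0` is used so only facet counts come back, and the JSON response uses the default flat `[term, count, ...]` layout. `facet.limit` and `facet.mincount` are additions beyond the two parameters above, and the helper names are made up. Note that facet counts are numbers of matching documents containing the term, not total occurrences.

```python
from urllib.parse import urlencode

def facet_url(base, field, limit=100):
    """Build a facet query that returns only term counts for `field`."""
    params = {
        "q": "*:*",            # match every document
        "rows": 0,             # skip the documents themselves
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,  # number of terms to return, most frequent first
        "facet.mincount": 1,   # drop terms that match no documents
        "wt": "json",
    }
    return f"{base}/select?{urlencode(params)}"

def parse_facets(response, field):
    """Turn the assumed flat facet list into (term, count) pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))

facet_query = facet_url("http://localhost:8983/solr", "yourfield")

# Made-up response body standing in for a real Solr reply.
sample = {"facet_counts": {"facet_fields": {"yourfield": ["solr", 12, "index", 7]}}}
facets = parse_facets(sample, "yourfield")
```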

Upvotes: 0
