Reputation: 11
How can one extract all the tokens from solr? Not from one document, but from all the documents indexed in solr?
Thanks!
Upvotes: 1
Views: 324
Reputation: 20859
You may do something like this(This sample is approved to be working on a lucene 4.x index):
IndexSearcher isearcher = new IndexSearcher(dir, true);
IndexReader reader = isearcher.getIndexReader();
Fields fields = MultiFields.getFields(reader);
Collection<String> cols = reader.getFieldNames(IndexReader.FieldOption.ALL);
for (String col : cols) {
Terms te = fields.terms(col);
if (te != null) {
TermsEnum tex = te.getThreadTermsEnum();
while (tex.next() != null)
// do something
tex.getTerm().text();
}
}
This iterates over all columns and also over every term per col. You may lookup the methods provided by TermsEnum like getTerm()
.
Upvotes: 1