Reputation: 4142
I understand how to obtain the document set from a Term object, but can you go the other way around to obtain the terms/term frequencies from a Document object?
Upvotes: 1
Views: 700
Reputation: 5042
Yes, it is possible to get the terms from a document, but there is no simple API for it. IndexReader has a method, getTermFreqVector(), that retrieves the terms in a document. You need to build a custom TermVectorMapper and pass it to getTermFreqVector().
In the custom TermVectorMapper, terms and their frequencies are collected in its map() method. Once getTermFreqVector() returns, the terms can be read back from the custom mapper.
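A minimal sketch of that collector, assuming the Lucene 3.x API, where TermVectorMapper declares the abstract callbacks setExpectations(String, int, boolean, boolean) and map(String, int, TermVectorOffsetInfo[], int[]). So that it compiles without Lucene on the classpath, it uses a local stand-in class (MapperStandIn, a hypothetical name) in place of org.apache.lucene.index.TermVectorMapper, and types the offsets as Object[]. TermFreqCollector and TermVectorSketch are illustrative names as well. Note that term vectors are only available for fields that were indexed with them enabled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for Lucene 3.x's org.apache.lucene.index.TermVectorMapper.
// It mirrors the two abstract callbacks; Lucene's map() takes
// TermVectorOffsetInfo[] where this sketch uses Object[].
abstract class MapperStandIn {
    abstract void setExpectations(String field, int numTerms,
                                  boolean storeOffsets, boolean storePositions);
    abstract void map(String term, int frequency, Object[] offsets, int[] positions);
}

// Collects term -> frequency as the callbacks arrive.
class TermFreqCollector extends MapperStandIn {
    private final Map<String, Integer> frequencies = new LinkedHashMap<String, Integer>();

    @Override
    void setExpectations(String field, int numTerms,
                         boolean storeOffsets, boolean storePositions) {
        // Called once per field before its terms are mapped; nothing to prepare here.
    }

    @Override
    void map(String term, int frequency, Object[] offsets, int[] positions) {
        // Sum frequencies in case the same term shows up in more than one field.
        Integer previous = frequencies.get(term);
        frequencies.put(term, previous == null ? frequency : previous + frequency);
    }

    Map<String, Integer> getFrequencies() {
        return frequencies;
    }
}

class TermVectorSketch {
    public static void main(String[] args) {
        TermFreqCollector collector = new TermFreqCollector();

        // With Lucene 3.x, and a collector extending the real TermVectorMapper,
        // this step would be:
        //     reader.getTermFreqVector(docId, collector);
        // Here the callbacks are invoked by hand with made-up terms.
        collector.setExpectations("body", 2, false, false);
        collector.map("lucene", 3, null, null);
        collector.map("index", 1, null, null);

        System.out.println(collector.getFrequencies());
    }
}
```

In real code, getTermFreqVector() takes the document number rather than a Document object, so the docId would typically come from a search hit.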
Upvotes: 1