user32275

Reputation: 11

Get payloads for matching terms in Solr using a custom DocumentTransformer

In Solr I have a custom fieldType called "payloads" which supports payloads:

<fieldtype name="payloads" stored="true" indexed="true" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="com.abc.CustomPayloadTokenFilterFactory" encoder="custom"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldtype>

I have defined a field of this type:

<field name="somefield" type="payloads" indexed="true" stored="true" multiValued="true" omitNorms="true"/>

The contents of "somefield" look like this: ["abcd|payload1", "xyz|payload2", "mnop|payload3"] (it can grow to around 1K words).

Let's say my query term is "xyz". I want to return just "xyz|payload2" or, better, just "payload2".

I have written a custom DocumentTransformer in Solr which, when a document matches my query, parses the field and returns "payload2".
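As an illustration, the parsing approach amounts to something like the sketch below. It is not the actual transformer: the class and method names (`PayloadLookup`, `payloadsFor`) are hypothetical, and it assumes the stored values are plain "term|payload" strings with the first `|` as the delimiter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr: scans the stored values of one
// document and collects the payload of every entry whose term matches.
class PayloadLookup {

    // Assumes each value has the form "term|payload"; values without a
    // '|' delimiter are skipped.
    static List<String> payloadsFor(List<String> storedValues, String queryTerm) {
        List<String> payloads = new ArrayList<>();
        for (String value : storedValues) {
            int sep = value.indexOf('|');
            if (sep < 0) {
                continue; // no payload attached to this value
            }
            String term = value.substring(0, sep);
            if (term.equals(queryTerm)) {
                payloads.add(value.substring(sep + 1));
            }
        }
        return payloads;
    }

    public static void main(String[] args) {
        List<String> stored = Arrays.asList("abcd|payload1", "xyz|payload2", "mnop|payload3");
        System.out.println(payloadsFor(stored, "xyz")); // prints [payload2]
    }
}
```

This walks every stored value of the field for each matching document, which is the cost I would like to avoid. It also compares the raw stored term, so it does not account for the stemming applied at index time.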

But it feels like I should be able to extract "payload2" without parsing the whole field, since Solr presumably already has this information in the index.

I am trying to write another DocumentTransformer which returns just the payload, using PostingsEnum:

IndexReader reader = this.context.getSearcher().getIndexReader();
final TermsEnum termsEnum = MultiFields.getTerms(
    reader, this.kField).iterator();
String term = "xyz";
PostingsEnum postingsEnum = MultiFields.getTermDocsEnum(
    reader,
    "somefield",
    new BytesRef(term));

if (termsEnum.seekExact(new BytesRef(term))) {
    PostingsEnum pe = termsEnum.postings(postingsEnum, PostingsEnum.ALL);

    int nextDoc = pe.advance(docid);
    postingsEnum.advance(docid);

    if (nextDoc == docid) { 
        if (sb.length() > 0)
            sb.append(",");
        sb.append(term );
        sb.append(pe.getPayload());
    }
}

But when I call pe.getPayload(), I just get null. Any suggestions or pointers on what could be wrong with the code above, and why the payload is not present?

(Note: The scenario presented here is heavily simplified. In reality the documents and queries contain other things as well, so please don't suggest changing the schema or not using payloads.)

Upvotes: 1

Views: 288

Answers (0)
