Michael Stum

Reputation: 181104

How to create a composite Key Field in Apache Solr?

I have an Apache Solr 3.5 setup with a schema.xml like this:

<field name="appid" type="string" indexed="true" stored="true" required="true"/>
<field name="docid" type="string" indexed="true" stored="true" required="true"/>

What I need is a field that concatenates the two and serves as the <uniqueKey>. There seems to be nothing built in, short of creating a multi-valued id field and using <copyField>, but it seems uniqueKey requires a single-valued field.

The only reason I need this is to allow clients to blindly fire <add> calls and have Solr figure out whether it's an addition or an update. Therefore, I don't much care what the ID looks like.

I assume I would have to write my own Analyzer or Tokenizer? I'm just starting out learning Solr, so I'm not 100% sure what I'd actually need and would appreciate any hints towards what I need to implement.

Upvotes: 3

Views: 4130

Answers (1)

javanna

Reputation: 60245

I would personally leave that burden to the clients, since it's pretty easy for them to add a field to each document.

Otherwise, you would have to write a few lines of code. You could write your own UpdateRequestProcessorFactory whose processor automatically adds the new field to every input document, based on the values of other existing fields. You can use a separator and keep the field single-valued. In your UpdateRequestProcessor, override the processAdd method like this:

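@Override
public void processAdd(AddUpdateCommand cmd) throws IOException {
    SolrInputDocument doc = cmd.getSolrInputDocument();
    // both source fields are declared required="true" in the schema
    String appid = (String) doc.getFieldValue("appid");
    String docid = (String) doc.getFieldValue("docid");
    // setField replaces any value the client sent, so the key stays single-valued
    doc.setField("uniqueid", appid + "-" + docid);
    // pass the document on to the next processor in the chain
    super.processAdd(cmd);
}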

Then add your processor factory as the first processor in a custom updateRequestProcessorChain in solrconfig.xml, and make sure your update handler actually uses that chain (via the update.chain request parameter in Solr 3.2 and later):

<updateRequestProcessorChain name="mychain" >
    <processor class="my.package.MyUpdateRequestProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
    <processor class="solr.LogUpdateProcessorFactory" />
</updateRequestProcessorChain>

I haven't tried this exact setup. I have done something similar, but not with a uniqueKey or required fields, so that is the one place you might run into problems. If the update processor sits at the beginning of the chain, though, it should work.
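
One caveat with the separator: if appid or docid can itself contain "-", two different pairs can concatenate to the same key (for example "a-b" + "c" and "a" + "b-c"). Below is a minimal sketch in plain Java, with no Solr dependency, of one way to build the key so that this cannot happen, by escaping the separator inside each part. The class name CompositeKey, its method names, and the backslash escape character are my own assumptions for illustration, not anything Solr provides.

```java
import java.util.Objects;

// Hypothetical helper: builds a single-valued composite key from two parts.
final class CompositeKey {
    private static final char SEP = '-';
    private static final char ESC = '\\';

    // Prefix every separator or escape character in the part with ESC,
    // so an unescaped SEP can only be the boundary between the two parts.
    static String escape(String part) {
        StringBuilder sb = new StringBuilder(part.length() + 4);
        for (char c : part.toCharArray()) {
            if (c == SEP || c == ESC) {
                sb.append(ESC);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Fail fast on a missing part instead of producing a key like "null-42".
    static String of(String appid, String docid) {
        Objects.requireNonNull(appid, "appid");
        Objects.requireNonNull(docid, "docid");
        return escape(appid) + SEP + escape(docid);
    }

    public static void main(String[] args) {
        System.out.println(of("a-b", "c")); // prints a\-b-c
        System.out.println(of("a", "b-c")); // prints a-b\-c
    }
}
```

Inside processAdd, the appid + "-" + docid expression could then be replaced by CompositeKey.of(appid, docid). If your IDs can never contain the separator, the plain concatenation is enough.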

Upvotes: 5
