Shivan Dragon

Reputation: 15229

Solr Cluster + DataImportHandler: can I have autogenerated id?

I'm using Solr 4.3 with 4 shards. I configured an autogenerated uniqueKey field as described here:

http://wiki.apache.org/solr/UniqueKey

It works fine if I use the actual update handler to insert documents (i.e. if I make an HTTP POST to /update with some JSON data, the unique key is autogenerated for each document).

If, however, I use the DataImportHandler to pull documents from a database, they are not added to the index; instead I see a warning in the Solr log saying that "mandatory id field is missing".

I know the DataImportHandler doesn't go through the UpdateHandler to add documents, but I was hoping this feature would work for DIH as well...

So my question is: does anybody know how to make id autogeneration work on a Solr 4.3 cluster when using the DataImportHandler to insert documents?

Upvotes: 0

Views: 376

Answers (1)

Shivan Dragon

Reputation: 15229

Well, the solution I ended up using was this:

  • I created a custom transformer in Java (actually I was already using one; I find it faster than writing transformers in JS, the other option Solr offers)
  • Inside the transformer I do pretty much what the UUIDUpdateProcessorFactory does, i.e. add a random UUID to each row as its id:

    @Override
    public Object transformRow(Map<String, Object> row, Context context) {
        // Give every imported row a random UUID as its unique key.
        row.put("id", UUID.randomUUID());
        return row;
    }
    
  • I then removed the <updateRequestProcessorChain name="uuid"> element from my solrconfig.xml and left only the schema.xml configuration described in the link in the question
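
For anyone who wants to try the row logic outside of Solr, here is a minimal sketch in plain Java. The class name UuidRowSketch is hypothetical, and it deliberately leaves out the Solr types: in a real DIH transformer the class would extend org.apache.solr.handler.dataimport.Transformer and the method would also take a Context argument, as in the snippet above. Two things differ from my actual code and are assumptions on my part: it only fills in "id" when the row has none, and it stores the UUID as a String.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for a DIH transformer, without the Solr dependency.
class UuidRowSketch {

    // Adds a random UUID under "id" unless the row already carries one
    // (assumption: ids coming from the database should be kept).
    static Map<String, Object> transformRow(Map<String, Object> row) {
        if (row.get("id") == null) {
            row.put("id", UUID.randomUUID().toString());
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("title", "example document");
        // Prints the row with a freshly generated id added.
        System.out.println(transformRow(row));
    }
}
```

The null check means rows that already have a key from the database are left alone, so the same transformer could be shared between entities with and without a natural id.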

Upvotes: 2
