I'm trying to implement deduplication in Solr by updating solrconfig.xml and schema.xml as described in the reference guide: https://lucene.apache.org/solr/guide/7_6/de-duplication.html
Signatures are generated, but every one of them is set to 0000000000000000 (16 zeros). I found another post asking the same question, but no one answered it: Solr Deduplication (dedupe) giving all zeros in signatureField
Notes:
Version: Solr 7.6.0
I updated many of the solr.processor.* classes to solr.update.processor.* after looking at the package names in the source code: https://github.com/apache/lucene-solr/tree/branch_7_6/solr/core/src/java/org/apache/solr/update/processor
My file setup:
solrconfig.xml:
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.update.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <bool name="overwriteDupes">true</bool>
    <str name="fields">name,content</str>
    <str name="signatureClass">solr.update.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.update.LogUpdateProcessorFactory" />
  <processor class="solr.update.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
<requestHandler name="/update" class="solr.UpdateRequestHandler" >
  <lst name="defaults">
    <str name="update.chain">dedupe</str>
  </lst>
</requestHandler>
schema.xml:
<field name="signature" type="string" stored="true" indexed="true" multiValued="false" />
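To make the setup easier to reproduce, here is a minimal sketch of a test document in Solr's XML update format that carries both fields listed in `fields` (name and content) and could be sent to the /update handler configured above. The `id` field and all values are placeholders assumed for illustration; they are not taken from the real index.

```xml
<!-- Hypothetical test document: "id" and all field values are assumed placeholders -->
<add>
  <doc>
    <field name="id">doc1</field>
    <!-- The two fields the dedupe chain is configured to build the signature from -->
    <field name="name">example name</field>
    <field name="content">example content</field>
  </doc>
</add>
```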
Any help is appreciated! :)