Reputation: 21
I have a field "text", which I need to copy to text_en or text_es based on the language of "text". Below is my managed_schema.xml:
<updateRequestProcessorChain name="langid">
<processor class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory">
<bool name="langid">true</bool>
<str name="langid.fl">text</str>
<str name="langid.langField">tweet_lang</str>
<str name="langid.whitelist">es,en</str>
<bool name="langid.map">true</bool>
<!--bool name="langid.map.individual">true</bool-->
<str name="langid.map.individual.fl">text</str>
<bool name="langid.map.keepOrig">true</bool>
<str name="langid.fallback">ko</str>
</processor>
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
I created copy fields from text to text_en and text_es. When I post data in Spanish, it is copied from text to both text_es and text_en!
How do I solve this?
Thanks!
Upvotes: 1
Views: 1140
Reputation: 21
Thanks for the heads-up! The issue was solved by removing the copy fields and creating dynamic fields *_es and *_en in schema.xml.
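For anyone landing here, the dynamic fields could look roughly like this. It is only a sketch: the field types text_en and text_es are assumed to be defined in the schema (Solr's sample configsets ship them), and the indexed/stored values are placeholders to adjust.

```xml
<!-- Catch-all fields for the language-mapped output, e.g. text_en and text_es -->
<dynamicField name="*_en" type="text_en" indexed="true" stored="true"/>
<dynamicField name="*_es" type="text_es" indexed="true" stored="true"/>
```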
Upvotes: 0
Reputation: 16095
By creating copyFields from text to text_en and text_es, you get the incoming data in both fields regardless of the language detection. That is exactly what copyField is supposed to do: it copies unconditionally.
The updateRequestProcessor will actually make a copy (rather than a move) because you set <bool name="langid.map.keepOrig">true</bool>.
Other than that, the processor's config looks fine. Just remove these copyFields and make sure the mapped fields text_en and text_es are properly defined in your schema.
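As a rough illustration of what "properly defined" could mean, here is a minimal schema sketch. The field types (text_general, string, text_en, text_es) and the indexed/stored attributes are assumptions based on Solr's sample configsets, not taken from the question, so adapt them to your own schema.

```xml
<!-- Source field inspected by the langid processor (langid.fl) -->
<field name="text" type="text_general" indexed="true" stored="true"/>
<!-- Receives the detected language code (langid.langField) -->
<field name="tweet_lang" type="string" indexed="true" stored="true"/>
<!-- Targets of langid.map; filled by the processor, not by copyField -->
<field name="text_en" type="text_en" indexed="true" stored="true"/>
<field name="text_es" type="text_es" indexed="true" stored="true"/>
<!-- Intentionally no copyField from text to text_en or text_es -->
```

With this layout a Spanish document should end up with text and text_es populated and text_en absent. Since langid.fallback is set to ko, documents that fall back would presumably be mapped to text_ko, so that field (or a matching dynamic field) may need to exist as well.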
Upvotes: 1