Reputation: 5250
I want automatically generated IDs for my Solr documents. I did it exactly as described in the Solr Cookbook, but it doesn't work. I get this exception (running the default setup on Jetty):
ERROR org.apache.solr.core.CoreContainer – Unable to create core: collection1
org.apache.solr.common.SolrException: QueryElevationComponent requires the schema to have a uniqueKeyField.
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
at org.apache.solr.core.CoreContainer.createFromLocal(CoreConta
Did I miss something?
My schema.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="transcripts" version="1.5">
<fields>
<field name="id" type="uuid" indexed="true" stored="true" default="NEW" required="true"/>
<field name="stime" type="long" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="etime" type="long" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="speakerid" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="speakergender" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="videoid" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="transcriptLIUM" type="text_en_splitting" indexed="true" stored="true" multiValued="false" required="false"/>
<field name="transcriptLIMSI" type="text_en_splitting" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="_version_" type="long" indexed="true" stored="true"/>
</fields>
<types>
<fieldType name="uuid" class="solr.UUIDField" indexed="true" />
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
<!-- A text field with defaults appropriate for English, plus
aggressive word-splitting and autophrase features enabled.
This field is just like text_en, except it adds
WordDelimiterFilter to enable splitting and matching of
words on case-change, alpha numeric boundaries, and
non-alphanumeric chars. This means certain compound word
cases will work, for example query "wi fi" will match
document "WiFi" or "wi-fi".
-->
<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- TODO: our THD tokenizer will be substituted here - use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<!-- Case insensitive stop word removal.
add enablePositionIncrements=true in both the index and query
analyzers to leave a 'gap' for more accurate phrase queries.
-->
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="lang/stopwords_en.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="lang/stopwords_en.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
<!-- Less flexible matching, but less false matches. Probably not ideal for product names,
but may be good for SKUs. Can insert dashes in the wrong place and still match. -->
<fieldType name="text_en_splitting_tight" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.EnglishMinimalStemFilterFactory"/>
<!-- this filter can remove any duplicate tokens that appear at the same position - sometimes
possible with WordDelimiterFilter in conjunction with stemming. -->
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
</types>
</schema>
Upvotes: 0
Views: 1344
Reputation: 311
According to the Solr docs, default="NEW" should not be used when the UUID-typed field is also meant to serve as the unique key. They also advise against default="NEW" in SolrCloud environments, because it would lead to a different UUID being generated on every replica.
Instead, use the 'UUIDUpdateProcessorFactory' to generate the ID in an update processor chain.
The following Stack Overflow thread contains a hint on how the processor chain should be configured: Configuring Solr to use UUID as a key
If you do not intend to define a custom request handler, you can pass the chain name as a query parameter on the update URL, e.g. http://<host>:<port>/solr/<core>/update?commit=true&update.chain=<your chain name>
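As a rough illustration of the chain described above, a solrconfig.xml fragment could look like the sketch below. The chain name "uuid" is an arbitrary placeholder, and the target field "id" is assumed from the schema in the question; adjust both to your setup.

```xml
<!-- solrconfig.xml (sketch): fill the "id" field with a generated UUID
     when an incoming document does not already provide one.
     The chain name "uuid" is a placeholder, not a required value. -->
<updateRequestProcessorChain name="uuid">
  <processor class="solr.UUIDUpdateProcessorFactory">
    <!-- assumed: the uniqueKey field from the question's schema -->
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- RunUpdateProcessorFactory must come last so the document is actually indexed -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

With a chain like this in place, an update request would select it by passing update.chain=uuid, and the default="NEW" attribute on the id field should no longer be needed.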
Upvotes: 0
Reputation: 52799
Query elevation requires a unique key to be defined in schema.xml, for example:
<uniqueKey>fileid</uniqueKey>
Also, the unique key must actually be unique; in your case the default is NEW, which may not be unique.
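Applied to the schema in the question, the declaration might look like the sketch below. It assumes "id" (rather than "fileid") is the field intended as the key, and that the value is supplied by the client or by an update processor rather than by default="NEW".

```xml
<!-- schema.xml (sketch): only the parts relevant to the unique key are shown -->
<fields>
  <!-- assumed: default="NEW" dropped, the id is generated elsewhere -->
  <field name="id" type="uuid" indexed="true" stored="true" required="true"/>
  <!-- remaining fields from the question stay unchanged -->
</fields>
<!-- uniqueKey is declared outside <fields>, as a direct child of <schema> -->
<uniqueKey>id</uniqueKey>
```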
Also note
Upvotes: 0
Reputation: 2549
If you want to keep query elevation, read the UniqueKey wiki page, especially the "UUID techniques" section.
Upvotes: 1