Reputation: 209
I have been using Solr for a while now (via acts_as_solr), but I just came across a very strange problem that I can't seem to solve.
I have a 'text' field; let's call it
audience = [students, teachers, students_teachers, none]
When I send the query
q= audience:students
it returns only the documents with the field set to students.
Yet if I do
fq= audience:students
I get back results with both [students, students_teachers].
I have tried putting quotes, parentheses, and all sorts of things around the filter query, but it does not seem to honor them as I would expect. I am actually using a negated fq here, to hide some results from the user.
I am using Solr 1.4.1.
Any thoughts? I am about to change the options to unique words with no reuse. It might be an issue with the underscores in the names.
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
Upvotes: 1
Views: 789
Reputation: 99730
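The WordDelimiterFilterFactory in your field type is probably generating the terms "students" and "teachers" from the string "students_teachers": it splits tokens on non-alphanumeric characters such as the underscore, and with generateWordParts="1" each part is indexed as its own term.
So when you search for "students", it also matches documents whose original value was "students_teachers".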
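One possible way around this, assuming the audience values only ever need exact matching, is to filter on a non-analyzed string field instead of the analyzed 'text' field. The following is a minimal sketch rather than a drop-in fix: the field name audience_exact is hypothetical, the source field is assumed to be named audience, and acts_as_solr may apply its own field naming, so the names would need adapting to the actual schema.

```xml
<!-- StrField indexes the whole value as a single term, with no tokenizing,
     word splitting, or stemming, so "students_teachers" stays intact. -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

<!-- Hypothetical exact-match copy of the audience field (name is an assumption). -->
<field name="audience_exact" type="string" indexed="true" stored="false"/>

<!-- Assumes the source field is called "audience" in schema.xml. -->
<copyField source="audience" dest="audience_exact"/>
```

After reindexing, a filter such as fq=audience_exact:students (or a negated one such as fq=-audience_exact:students_teachers) should then match only the exact value, since no analysis is applied to either the indexed term or the query term.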
As an aside: Solr is a very configurable tool and can be quite complex. I recommend not treating it as a black box, or you will very probably have more and more of these "WTF moments".
Upvotes: 2