Reputation: 45
I have a problem with Apache Solr.
My results include a field named url. It returns values like these:
http://domain.com/re-RU/someLink
http://domain.com/de-DE/someLink
http://domain.com/en-EN/someLink
http://domain.com/cl-EN/someLink
http://domain.com/ka-EN/someLink
When I add a filter query parameter to my query:
http://ip:port/solr/example/select?q=someSentence&fq=url:ru-RU&wt=json&indent=true
It works very well, but only for the de-DE and ru-RU languages.
When I try to filter with en-EN, the results also contain cl-EN and ka-EN.
Where is the problem, and how can I resolve it?
Upvotes: 0
Views: 694
Reputation: 2166
Create a field type named urlFilter in your schema.xml, as below.
<fieldType name="urlFilter" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so a URL stays together as one token at this stage -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <!-- preserveOriginal="1" keeps the unsplit token alongside the generated parts -->
    <filter class="solr.WordDelimiterFilterFactory" generateNumberParts="1" stemEnglishPossessive="1" generateWordParts="1" preserveOriginal="1" catenateWords="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Then use the field type above as the type for your url field in schema.xml, as below:
<field name="url" type="urlFilter" indexed="true" stored="true"/>
After changing the field type, reindex your documents, and then query like this:
http://ip:port/solr/example/select?q=someSentence&fq=url:*ru-RU*&wt=json&indent=true
This should work. Let me know if that helps you :)
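If you build the request from code, the wildcard filter should be URL-encoded so that the ":" and "*" characters survive transport. A minimal sketch, assuming Python and the placeholder base address from the question; the helper name build_select_url is hypothetical and not part of Solr:

```python
from urllib.parse import urlencode


def build_select_url(base, sentence, language):
    """Build a select URL with a wildcard filter query on the url field.

    `base` is the core address, e.g. the placeholder "http://ip:port/solr/example".
    """
    params = {
        "q": sentence,
        # Wildcard on both sides, as in the query shown above.
        "fq": f"url:*{language}*",
        "wt": "json",
        "indent": "true",
    }
    # urlencode percent-encodes reserved characters in each value.
    return f"{base}/select?{urlencode(params)}"


url = build_select_url("http://ip:port/solr/example", "someSentence", "ru-RU")
print(url)
```

Solr decodes the parameters on its side, so the filter it receives is still url:*ru-RU*.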
Upvotes: 1
Reputation: 426
Check the field type of url in your schema.xml: the value is probably being split on "-", so en-EN produces the separate tokens en and EN. For example, if you use StandardTokenizerFactory as the tokenizer class, en-EN is split into en and EN, and de-DE into de and DE. The same applies at query time: if the query analyzer also uses StandardTokenizerFactory, fq=url:en-EN is split into the tokens en and EN as well. If the tokens are also lowercased, the filter effectively searches for the single term en, which would explain why URLs containing cl-EN and ka-EN match too. For more about tokenizers, see: https://cwiki.apache.org/confluence/display/solr/Tokenizers
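The effect can be reproduced outside Solr with a rough simulation. This is only a sketch, not Solr's actual analysis chain: it assumes the analyzer splits on every non-alphanumeric character and lowercases the tokens, and that a document matches when all query tokens occur in it. The function names are made up for illustration.

```python
import re


def tokenize(text):
    """Rough stand-in for a tokenizer that splits on punctuation, plus lowercasing."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(query, url):
    """True when every token of the query also appears among the URL's tokens."""
    return set(tokenize(query)) <= set(tokenize(url))


# URLs taken from the question.
urls = [
    "http://domain.com/re-RU/someLink",
    "http://domain.com/de-DE/someLink",
    "http://domain.com/en-EN/someLink",
    "http://domain.com/cl-EN/someLink",
    "http://domain.com/ka-EN/someLink",
]


def filter_urls(query):
    """Return the URLs that the simulated filter query would keep."""
    return [u for u in urls if matches(query, u)]


print(tokenize("en-EN"))      # the query collapses to the token "en" twice
print(filter_urls("en-EN"))   # also keeps the cl-EN and ka-EN URLs
print(filter_urls("de-DE"))   # keeps only the de-DE URL
```

Under these assumptions de-DE looks correct only because no other URL happens to contain the token de, while en is shared by three of them.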
Upvotes: 2