Thien Nguyen

Reputation: 1

Elasticsearch: Stemming problem for Polish on/ona

I'm new to Elasticsearch and am having trouble with Polish search queries, for example on/ona (meaning men/women in Polish). With the standard analyzer, Elasticsearch seems to handle the men/women filter and return the correct records, but when I search for "on", the results include records for both "on" and "ona". I also tried the keyword_marker token filter (https://www.elastic.co/guide/en/elasticsearch/reference/1.7/analysis-keyword-marker-tokenfilter.html), but that didn't work either.

Has anyone been in this situation before? Can someone explain how this works, and is there a solution?

I tried this configuration:

index :
    analysis :
        analyzer :
            myAnalyzer :
                type : custom
                tokenizer : standard
                filter : [lowercase, protwords, porter_stem]
        filter :
            protwords :
                type : keyword_marker
                keywords : [ona]

I expected that searching for "on" would return only records containing "on", not "ona", and vice versa.
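
As a rough illustration of that expectation, the Python sketch below models a lowercase -> keyword_marker -> stemmer chain. It is a toy model, not Elasticsearch code: `analyze`, `keyword_marker`, and `toy_stem` are hypothetical names, and `toy_stem` (which just strips a trailing "a") is an assumption standing in for a real stemmer. It is not `porter_stem`, which targets English.

```python
# Toy model of a token-filter chain: lowercase -> keyword_marker -> stemmer.
# Each token is a (text, is_keyword) pair, loosely mirroring how a keyword
# flag travels with a token through the filter chain.


def lowercase(tokens):
    """Lowercase every token, leaving the keyword flag untouched."""
    return [(text.lower(), is_keyword) for text, is_keyword in tokens]


def keyword_marker(tokens, keywords):
    """Flag tokens listed in `keywords` so later stemmers skip them."""
    return [(text, is_keyword or text in keywords) for text, is_keyword in tokens]


def toy_stem(tokens):
    """Hypothetical stemmer: strips a trailing 'a' from unflagged tokens."""
    stemmed = []
    for text, is_keyword in tokens:
        if not is_keyword and len(text) > 2 and text.endswith("a"):
            text = text[:-1]
        stemmed.append((text, is_keyword))
    return stemmed


def analyze(text, keywords=frozenset()):
    """Split on whitespace, then run the three filters in order."""
    tokens = [(word, False) for word in text.split()]
    tokens = lowercase(tokens)
    tokens = keyword_marker(tokens, keywords)
    tokens = toy_stem(tokens)
    return [token for token, _ in tokens]


# Without protection both words collapse to the same term, so a search
# for "on" would also match "ona".
print(analyze("On Ona"))

# With "ona" protected, the two words stay distinct terms.
print(analyze("On Ona", keywords={"ona"}))
```

In this model the protected word survives only because it is flagged before the stemmer runs, and the comparison is made on the already lowercased token. In Elasticsearch itself the analyzer is applied both at index time and at search time, so documents indexed before a change to the analyzer would need to be reindexed for the new terms to take effect.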

Upvotes: 0

Views: 134

Answers (1)

Kendall

Reputation: 1

You can integrate the Lucene IK analyzer into Elasticsearch with the IK Analysis plugin, which supports custom dictionaries. You can add the segmentation you need to the custom dictionary: https://github.com/medcl/elasticsearch-analysis-ik. I hope this helps.

Upvotes: 0
