adfsdoesntwork

Reputation: 1

Fuzzy Arabic Search In Index

I am trying to use Elasticsearch's fuzzy search feature with Arabic search queries. More details about it are here: https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#fuzziness

Unfortunately, I get mixed results. Sometimes I do get relevant results that contain some errors and that are not returned without the fuzzy logic (in this case almost all results are relevant). But for bad queries, which usually return few results (fewer than 10), I get hundreds of irrelevant ones.

Does anyone know how I should treat those queries, so that when there is a lot of noise it is eliminated, and when there are a lot of relevant results they are all present? How should I tune the fuzziness so that it isn't harmful?

Upvotes: 0

Views: 725

Answers (1)

Ahmed Mamdouh

Reputation: 1

I only found this question recently, but I wanted to answer it because someone may need it now.

First, you need to understand what fuzziness is and how it behaves in Elasticsearch, and in particular how it behaves with Arabic, which is very challenging. Nobody can answer your specific tuning questions, because the answer depends entirely on what your actual content looks like and on which user misspellings you expect to match that content. You also need to decide whether you really need fuzzy matching or something else.
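To make "what fuzziness is" a bit more concrete: Elasticsearch expresses fuzziness as an edit distance, and with `AUTO` the number of allowed edits depends on the term length (by default 0 edits for 1-2 characters, 1 edit for 3-5, 2 edits for longer terms). The sketch below only mimics that documented default in plain Python; `auto_fuzziness` is a hypothetical helper name, it does not call Elasticsearch, and it assumes length is counted in characters.

```python
def auto_fuzziness(term: str, low: int = 3, high: int = 6) -> int:
    """Approximate the edit distance that fuzziness "AUTO" allows for a term.

    low/high mirror the documented defaults of AUTO:3,6.
    Assumption: the term length is counted in characters (code points).
    """
    length = len(term)
    if length < low:
        return 0  # very short terms must match exactly
    if length < high:
        return 1  # 3-5 characters: one edit allowed
    return 2      # 6+ characters: two edits allowed


# Sample Arabic terms of different lengths (illustrative only).
for word in ["في", "كتب", "مدرسة", "الجامعة"]:
    print(word, len(word), auto_fuzziness(word))
```

If many of your indexed terms fall in the 3-5 character range, a single allowed edit can already match a lot of unrelated words, which may be one source of the noise described in the question.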

As for the irrelevant results you see from Elasticsearch, make sure you are using the right query, and be careful with query properties whose effect on your results you don't fully understand. Try the query with no additional properties first, then add them one by one, so you can see exactly which one causes the irrelevant results.
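A minimal sketch of that "add one property at a time" approach, assuming a `match` query on a hypothetical text field called `title`. The helper `build_match_query` is illustrative only; it just builds the JSON request bodies, which you would then send to the `_search` endpoint of your own index. `fuzziness`, `prefix_length`, and `max_expansions` are real `match` query parameters, but the values shown are placeholders, not recommendations.

```python
import json


def build_match_query(field: str, text: str, **options) -> dict:
    """Build a match query body; extra options are included only when passed."""
    params = {"query": text}
    params.update(options)
    return {"query": {"match": {field: params}}}


# Step 1: baseline, no extra properties.
baseline = build_match_query("title", "مدرسة")

# Step 2: add fuzziness only, and compare the hits with the baseline.
fuzzy = build_match_query("title", "مدرسة", fuzziness="AUTO")

# Step 3: restrain the fuzzy expansion, one property at a time.
restrained = build_match_query(
    "title",
    "مدرسة",
    fuzziness="AUTO",
    prefix_length=1,    # first character must match exactly
    max_expansions=20,  # cap how many term variations are generated
)

for body in (baseline, fuzzy, restrained):
    print(json.dumps(body, ensure_ascii=False))
```

Running each body against the same index and comparing the hit counts should show which property introduces the noise.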

Also keep in mind that sometimes the issue is not in the query itself but in the mapping of your index. When you are trying to resolve an issue, look at the index mappings and check whether they give you what you need. Make sure you are telling Elasticsearch that your content is Arabic, not English: if Elasticsearch analyzes your content as if it were English, you will certainly get bad results.

If anyone has an issue related to this topic, don't hesitate to reply and provide me with the full details of your issue.
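As a rough illustration of the mapping point, here is a sketch of an index body that assigns Elasticsearch's built-in `arabic` analyzer to a text field, plus a request body for the `_analyze` API to inspect the tokens it produces. The field name `title` and both helper function names are hypothetical; only the request bodies are built here, so adapt them to your own index and client.

```python
import json


def arabic_index_body(field: str) -> dict:
    """Index-creation body that analyzes `field` with the built-in arabic analyzer."""
    return {
        "mappings": {
            "properties": {
                field: {"type": "text", "analyzer": "arabic"}
            }
        }
    }


def analyze_body(text: str, analyzer: str = "arabic") -> dict:
    """Body for the _analyze API, to see which tokens the analyzer emits."""
    return {"analyzer": analyzer, "text": text}


index_body = arabic_index_body("title")
check_body = analyze_body("المدارس")

print(json.dumps(index_body, ensure_ascii=False, indent=2))
print(json.dumps(check_body, ensure_ascii=False))
```

Comparing the `_analyze` output for the `arabic` analyzer against the `standard` one on a few of your own documents is a quick way to check whether the mapping is part of the problem.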

Upvotes: 0
