Bald Eagle

Reputation: 95

LUCENE - Fuzzy Search on a word containing Space

The case I am facing seems very simple, but truly I can't imagine a clear solution:

I would expect a FuzzyQuery to suffice (since the edit distance is 1). But since the tokenizer I use splits on spaces, that approach doesn't work. I don't know which analyzer to use to allow this while keeping all the benefits of something like StandardAnalyzer (stopwords, the possibility of adding synonyms, ...).

Maybe it's simpler than I think (at least it seems so), but I really can't see any solution for now.

Upvotes: 0

Views: 452

Answers (1)

MatsLindh

Reputation: 52802

You can use a ShingleFilter to make Solr combine multiple tokens into one, with a user-defined separator.

That way you'll get "summer time" as a single token, as well as "summer" and "time" (unless you disable outputUnigrams). When you do this you'll get tokens with a small edit distance, and the fuzzy search should work as you want it to.
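To see why shingling helps, here is a minimal plain-Java sketch that does not use the Lucene API at all: it only imitates what a shingle filter with unigram output would emit for "summer time", then computes a plain Levenshtein distance from an assumed query term "summertime" to each emitted token. The class name `ShingleSketch`, both helper methods, and the query term are hypothetical, chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

class ShingleSketch {

    // Return the original tokens (unigrams) plus every run of `size` adjacent
    // tokens joined by `separator`. This only imitates shingle output; it is
    // not Lucene's ShingleFilter.
    static List<String> shingles(List<String> tokens, int size, String separator) {
        List<String> out = new ArrayList<>(tokens);
        for (int i = 0; i + size <= tokens.size(); i++) {
            out.add(String.join(separator, tokens.subList(i, i + size)));
        }
        return out;
    }

    // Plain Levenshtein distance (insert, delete, substitute) with two rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Tokens as a whitespace-splitting tokenizer would produce them.
        List<String> indexed = shingles(List.of("summer", "time"), 2, " ");
        String query = "summertime"; // assumed query term, not from the question
        for (String term : indexed) {
            System.out.println("\"" + term + "\" -> distance " + editDistance(query, term));
        }
    }
}
```

The shingle "summer time" is one inserted space away from "summertime", so it falls within a fuzzy query's edit budget, while the unigrams "summer" and "time" on their own are several edits away.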

Upvotes: 1
