Derek Hunziker

Reputation: 13141

How to perform a wildcard search in Lucene

I know that Lucene has extensive support for wildcard searches and I know you can search for things like:

Stackover* (which will return Stackoverflow)

That said, my users aren't interested in learning a query syntax. Can Lucene perform this type of wildcard search using an out-of-the-box Analyzer, or should I append "*" to every search query?

Upvotes: 3

Views: 11052

Answers (3)

sisve

Reputation: 19791

Doing this with string manipulations is tricky to get right, especially since the QueryParser supports boosting, phrases, etc.

You could use a QueryVisitor that rewrites TermQuery into PrefixQuery.

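// Rewrites every TermQuery in a parsed query tree into a PrefixQuery,
// so a search for "stackover" behaves like "stackover*".
// QueryVisitor is the base class referenced below.
public class PrefixRewriter : QueryVisitor {
    protected override Query VisitTermQuery(TermQuery query) {
        var term = query.GetTerm();
        var newQuery = new PrefixQuery(term);
        // Carry any boost (e.g. from "term^2") over to the rewritten query.
        return CopyBoost(query, newQuery);
    }
}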

The QueryVisitor base class can currently be found on GitLab.

The code was originally posted in a blog post that is now defunct; an archived copy is still available at archive.org.

Upvotes: 4

Robert Muir

Reputation: 3195

If you are considering turning every query into a wildcard, I would ask these questions:

  1. Is Lucene the best tool for the job? By default, wildcard queries rewrite to constant-score queries, which means you are throwing away relevance ranking completely and are no longer "searching" but "matching". Perhaps a search engine library is not the best solution for your application, and another tool (e.g. a database) would be better.
  2. If the answer to #1 is still 'yes', then I would recommend looking at the exact relevance problem you are trying to solve. For example, if you want queries to match compound or stemmed words, consider adding a decompounder or stemmer to your analysis chain instead. An n-gram indexing technique is another alternative.
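To make the n-gram idea concrete, here is a minimal, library-free sketch of "edge" n-grams, i.e. the leading substrings of a term. The class and method names (`EdgeNGrams`, `Generate`) are hypothetical and only illustrate the concept; in a real index you would more likely plug one of Lucene's n-gram token filters into the analysis chain rather than hand-roll this.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Lucene.
public static class EdgeNGrams
{
    // Returns the leading substrings of `term` whose lengths run from
    // minLength up to maxLength (capped at the term's own length).
    public static List<string> Generate(string term, int minLength, int maxLength)
    {
        if (term == null) throw new ArgumentNullException(nameof(term));
        if (minLength < 1) throw new ArgumentOutOfRangeException(nameof(minLength));

        var grams = new List<string>();
        int upper = Math.Min(maxLength, term.Length);
        for (int length = minLength; length <= upper; length++)
        {
            grams.Add(term.Substring(0, length));
        }
        return grams;
    }
}
```

For example, `EdgeNGrams.Generate("stackoverflow", 3, 6)` yields "sta", "stac", "stack" and "stacko". If those grams are indexed alongside the full term, a plain term query for "stack" can match the document without any wildcard, so normal relevance scoring still applies. The cost is a larger index.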

Upvotes: 3

Mark Broadhurst

Reputation: 2695

If I want to do something like that, I normally format the term before searching, e.g.:

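// Escape query-syntax characters so the user's input is treated as literal text.
searchTerm = QueryParser.Escape(searchTerm);
// Append the wildcard unless the input ends with a space, which would leave "*" as a term on its own.
if (!searchTerm.EndsWith(" "))
{
    searchTerm = string.Format("{0}*", searchTerm);
}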

This escapes any special characters the user has typed and, if the term doesn't end with a space, appends a * to it. The trailing-space check matters because a * on its own would cause a parsing exception.
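The snippet above only adds a wildcard to the last word of the input. As a variation (not what this answer does), the sketch below appends * to every whitespace-separated word. Everything in it is hypothetical: `PrefixQueryText`, `Build` and the hand-rolled `Escape` are stand-ins so the example runs without Lucene, and the list of escaped characters is an assumption based on the classic query syntax. In real code you would call QueryParser.Escape instead.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of Lucene.
public static class PrefixQueryText
{
    // Assumed set of query-syntax characters; prefer QueryParser.Escape in real code.
    private const string SpecialChars = "\\+-!():^[]\"{}~*?|&";

    // Prefixes each special character with a backslash.
    public static string Escape(string token)
    {
        var sb = new StringBuilder();
        foreach (char c in token)
        {
            if (SpecialChars.IndexOf(c) >= 0) sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Splits the input on whitespace, escapes each word and appends "*" to it.
    // Blank input produces an empty string, so "*" never appears on its own.
    public static string Build(string userInput)
    {
        if (string.IsNullOrWhiteSpace(userInput)) return string.Empty;

        var tokens = userInput.Split(
            new[] { ' ', '\t', '\r', '\n' },
            StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", tokens.Select(t => Escape(t) + "*"));
    }
}
```

With this, `PrefixQueryText.Build("stack over")` produces `stack* over*`, and blank input produces an empty string that the caller can skip instead of sending to the parser.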

Upvotes: 0
