CruftyCraft

Reputation: 781

How to do partial beginning matches in Solr?

I'm trying to search for partial beginning (prefix) matches on a large list of last names, so Wein* should find Weinberg, Weinkamm, etc.

I could do this by creating a special field, and adding

<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50" preserveOriginal="1"/>

to its type specification in schema.xml. When I add the line above only to the index analyzer and leave the filter out of the query analyzer, I can then search with just special_field:Wein and get the expected results.
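For reference, the setup described above might look roughly like the sketch below. The type name text_prefix is a placeholder, and the KeywordTokenizerFactory and LowerCaseFilterFactory choices are assumptions that suit single-token last names rather than part of the original configuration. The sketch keeps only the gram-size attributes from the filter line above.

```xml
<!-- Hypothetical field type: edge n-grams are generated at index time only -->
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Assumption: treat the whole last name as one token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Assumption: lowercase so that "wein" and "Wein" match alike -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- "Weinberg" is indexed as w, we, wei, wein, ..., weinberg -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-gram filter here: the query term is matched as typed -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="special_field" type="text_prefix" indexed="true" stored="false"/>
```

With a type like this, special_field:Wein becomes an ordinary term lookup against the indexed gram "wein", so no wildcard is needed in the query.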

Now I see that Solr also has a * wildcard syntax. What's the connection between EdgeNGramFilterFactory and the * syntax?

Am I doing things correctly or is there a better, more regular way?

Thanks!

Upvotes: 6

Views: 5571

Answers (3)

Petah

Reputation: 46050

Or just do a simple wildcard match:

name:Pe*

Upvotes: 3

CruftyCraft

Reputation: 781

Note: I also asked this question in the Lucene forum where I got a good answer: http://lucene.472066.n3.nabble.com/How-to-do-partial-beginning-matches-td781147.html

Upvotes: 1

Rodes

Reputation: 124

I don't recommend the Wein* query. It is implemented internally as a PrefixQuery, which rewrites the original query to include every indexed term that starts with the prefix "Wein". Depending on how large your index is (that is, how many terms it contains), this query rewriting can become a bottleneck.

Applying the EdgeNGramFilter at index time is a better approach. It uses more index space, but queries are processed much faster.

Upvotes: 1
