Reputation: 2343
I've just started trying to use Solr, and already I think that I'm attempting to use it backwards. Could someone let me know if what I'm trying to do is possible?
In normal use, one might specify a phrase and then search stored documents for instances of that phrase. However, I have a list of stored phrases and I'm trying to determine which of those phrases my query string contains.
For instance: suppose that I have phrases like these stored in Solr:
1:"fish fingers"
2:"apple pie"
If my search term is "I like fish fingers" then I want Solr to return the first record. If it's "I like fish fingers and apple pie" then I want it to return both records. But if it's "I like apple fingers and fish pie" then I want it to return no records.
(Of course, if the phrases were always two words long, it would be pretty simple to do this by constructing a disjunctive query from every two-word sequence in the search term. But the phrases can potentially be any length.)
Thanks for any help.
Upvotes: 4
Views: 2925
Reputation: 2343
I decided to read through the documentation on each Filter and Tokenizer, which is where I came across this description of the PositionFilterFactory:
Another example is when exact matching hits
are wanted for _any_ shingle within the query
The configuration given on this page is nearly exactly what I want. Unfortunately, since there doesn't seem to be a filter which glues terms split by the tokenizer back into a single token, I can't do any stemming. But maybe I can knock up such a filter myself.
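To make the missing piece concrete, here is a rough sketch in plain Python (not Solr code) of stemming each word first and only then gluing the words back into a single exact-match token. Everything in it is an assumption for illustration: `toy_stem` is a hypothetical stand-in for a real stemmer, and `glued_key` and `match_stemmed` are names I made up.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips one trailing 's'."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def glued_key(words):
    """Stem each word, then join the stems into one exact-match token."""
    return " ".join(toy_stem(w.lower()) for w in words)


def match_stemmed(query, phrases):
    """Return the stored phrases whose glued, stemmed form occurs in query."""
    index = {glued_key(p.split()): p for p in phrases}
    longest = max((len(p.split()) for p in phrases), default=0)
    words = query.split()
    hits = []
    # Try every run of consecutive query words, up to the longest phrase.
    for size in range(1, longest + 1):
        for start in range(len(words) - size + 1):
            hit = index.get(glued_key(words[start:start + size]))
            if hit is not None and hit not in hits:
                hits.append(hit)
    return hits


print(match_stemmed("I like apple pies", ["fish fingers", "apple pie"]))
```

Because the stemming happens before the words are joined, "apple pies" and "apple pie" end up as the same token, while word order still has to match exactly.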
Upvotes: 2
Reputation: 20621
I believe shingles - token n-grams used for matching - could be a start in solving your problem.
Check out ShingleFilterFactory and ShingleFilter.
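As a minimal illustration of the idea, independent of Solr, the sketch below builds word shingles from the query and looks each one up among the stored phrases. The function names and the in-memory dictionary are assumptions made for the example; in Solr the shingling would be done by the analyzer chain rather than by hand.

```python
def shingles(text, max_size):
    """Yield every run of 1..max_size consecutive words in text."""
    words = text.lower().split()
    for size in range(1, max_size + 1):
        for start in range(len(words) - size + 1):
            yield " ".join(words[start:start + size])


def contained_phrases(query, stored):
    """Return the ids of stored phrases that appear as a shingle of query."""
    by_phrase = {phrase.lower(): pid for pid, phrase in stored.items()}
    # No shingle longer than the longest stored phrase can match.
    longest = max((len(p.split()) for p in by_phrase), default=0)
    return sorted({by_phrase[s] for s in shingles(query, longest) if s in by_phrase})


stored = {1: "fish fingers", 2: "apple pie"}
print(contained_phrases("I like fish fingers", stored))
print(contained_phrases("I like fish fingers and apple pie", stored))
print(contained_phrases("I like apple fingers and fish pie", stored))
```

Each stored phrase is treated as a single exact token, so a phrase only matches when its words appear consecutively and in order in the query.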
Upvotes: 2
Reputation: 52779
This looks like the same functionality as the KeyMatch feature of the Google Search Appliance, which tries to match the indexed terms against the query rather than the other way around. We also had to implement a custom solution for it.
You would probably need to implement your own query parser to do this.
As you already mentioned, that is probably the only option you have.
Upvotes: 1