Reputation: 32004
For example:
There is a column "description" in a Lucene document. Let's say the content of "description" is [hello foo bar]. I want the query [hello f] to hit the document, while [hello ff] or [hello b] should not hit it.
I build the Query programmatically: PrefixQuery and TermQuery instances are added to a BooleanQuery, but they don't work as expected. StandardAnalyzer is used.
Test cases:
a): new PrefixQuery(new Term("description", "hello f")) -> 0 hits
b): PhraseQuery query = new PhraseQuery(); query.add(new Term("description", "hello f*")); -> 0 hits
c): PhraseQuery query = new PhraseQuery(); query.add(new Term("description", "hello f")); -> 0 hits
Any recommendations? Thanks!
Upvotes: 1
Views: 3501
Reputation: 8896
It doesn't work because you are passing multiple terms to one Term object. If you want all your search words to be matched as prefixes, you need to:
Tokenize the input string with your analyzer; it will split your search text "hello f" into "hello" and "f":
TokenStream tokenStream = analyzer.tokenStream(null, new StringReader(searchText));
CharTermAttribute termAttribute = tokenStream.addAttribute(CharTermAttribute.class);
List<String> tokens = new ArrayList<String>();
tokenStream.reset(); // required before the first incrementToken() in Lucene 4.0 and later
while (tokenStream.incrementToken()) {
    tokens.add(termAttribute.toString());
}
tokenStream.end();
tokenStream.close();
Put each token into a Term object, wrap each Term in a PrefixQuery, and add all the PrefixQuery instances to a BooleanQuery.
EDIT: For example like this:
BooleanQuery booleanQuery = new BooleanQuery();
for (String token : tokens) {
    booleanQuery.add(new PrefixQuery(new Term(fieldName, token)), Occur.MUST);
}
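To make the matching behaviour of that BooleanQuery easier to reason about, here is a minimal plain-Java sketch that models it without Lucene. It is an illustration under assumptions, not Lucene code: the class `PrefixMatchSketch` and its methods `tokenize` and `matchesAllPrefixes` are hypothetical names, and `tokenize` is only a crude stand-in for StandardAnalyzer (lowercase, split on non-alphanumeric characters).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PrefixMatchSketch {

    // Crude stand-in for an analyzer (assumption): lowercase the text
    // and split it on anything that is not a letter or digit.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Models a BooleanQuery whose clauses are all prefix clauses with Occur.MUST:
    // every query token must be a prefix of at least one document token.
    // Token order and position are ignored, as they are in such a query.
    public static boolean matchesAllPrefixes(String document, String query) {
        List<String> docTokens = tokenize(document);
        for (String queryToken : tokenize(query)) {
            boolean found = false;
            for (String docToken : docTokens) {
                if (docToken.startsWith(queryToken)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String doc = "hello foo bar";
        System.out.println(matchesAllPrefixes(doc, "hello f"));  // true
        System.out.println(matchesAllPrefixes(doc, "hello ff")); // false
        System.out.println(matchesAllPrefixes(doc, "hello b"));  // true
    }
}
```

Under this model [hello f] hits and [hello ff] does not. Note that [hello b] also hits, because "bar" starts with "b"; excluding it as the question asks would need position-aware matching, which this sketch does not attempt.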
Upvotes: 1