gstackoverflow

Reputation: 37034

Hibernate Search case-insensitive search does not work correctly with LowerCaseFilterFactory

I have the following configuration for hibernate-search:

@AnalyzerDef(name = "autocompleteNGramAnalyzer",

// Split input into tokens according to tokenizer
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),

        filters = {
                // Split tokens on delimiters and case changes, and also emit
                // the concatenation of all parts (catenateAll)
                @TokenFilterDef(factory = WordDelimiterFilterFactory.class,
                        params = @Parameter(name = "catenateAll", value = "1")),
                // Normalize token text to lowercase, as the user is unlikely to
                // care about casing when searching for matches
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                @TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = {
                        @Parameter(name = "minGramSize", value = "2"),
                        @Parameter(name = "maxGramSize", value = "5")})})

The behaviour is really strange.

I have a field with the value George Cain.

If I search for Ge, it returns the value.
If I search for GeO, it returns the value.
If I search for GeOR, it doesn't return anything.
If I search for GeoR, it returns the value.
If I search for GEOR, it returns the value.

What is wrong with GeOR? How can I fix this?

Is it possible to debug this framework?

Upvotes: 0

Views: 559

Answers (2)

gstackoverflow

Reputation: 37034

I customized the WordDelimiterFilterFactory parameters, and now it works:

@TokenFilterDef(factory = WordDelimiterFilterFactory.class,
        params = {
                @Parameter(name = "catenateAll", value = "1"),
                // generateWordParts is 1 by default
                @Parameter(name = "generateWordParts", value = "0")})
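
A likely explanation, offered as a sketch rather than a confirmed diagnosis: the word delimiter filter splits a token where a lowercase letter is followed by an uppercase one, so GeOR becomes Ge, OR and the catenated GeOR. After lowercasing and edge n-gramming, the query then contains the term "or", which is not among the terms indexed for George Cain. The plain-Java simulation below imitates that chain. It does not use Lucene; the class AnalyzerSketch and its methods are hypothetical names, whitespace splitting stands in for StandardTokenizer, only case-change splitting is modelled, and it assumes the query requires every analyzed term to match (the question does not show how the query is built).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified stand-in for the analyzer chain; NOT Lucene's real implementation.
final class AnalyzerSketch {

    // Split a word wherever a lowercase letter is followed by an uppercase one.
    static List<String> splitOnCaseChange(String word) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < word.length(); i++) {
            if (Character.isLowerCase(word.charAt(i - 1))
                    && Character.isUpperCase(word.charAt(i))) {
                parts.add(word.substring(start, i));
                start = i;
            }
        }
        parts.add(word.substring(start));
        return parts;
    }

    // Word delimiter (catenateAll = 1) -> lowercase -> edge n-grams of 2..5 chars.
    static Set<String> analyze(String text, boolean generateWordParts) {
        Set<String> grams = new LinkedHashSet<>();
        for (String word : text.trim().split("\\s+")) {
            List<String> parts = splitOnCaseChange(word);
            List<String> tokens = new ArrayList<>();
            // A word without any split point passes through unchanged.
            if (parts.size() == 1 || generateWordParts) {
                tokens.addAll(parts);
            }
            // catenateAll = 1: also emit the parts joined back together.
            if (parts.size() > 1) {
                tokens.add(word);
            }
            for (String token : tokens) {
                String lower = token.toLowerCase(Locale.ROOT);
                // Tokens shorter than the minimum gram size yield no terms.
                for (int n = 2; n <= Math.min(5, lower.length()); n++) {
                    grams.add(lower.substring(0, n));
                }
            }
        }
        return grams;
    }

    // Assumption: a hit requires every query term to be present in the index.
    static boolean matchesAll(Set<String> indexTerms, Set<String> queryTerms) {
        return indexTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        for (boolean parts : new boolean[] {true, false}) {
            Set<String> index = analyze("George Cain", parts);
            for (String q : new String[] {"Ge", "GeO", "GeOR", "GeoR", "GEOR"}) {
                Set<String> terms = analyze(q, parts);
                System.out.println("generateWordParts=" + (parts ? 1 : 0) + " "
                        + q + " -> " + terms + " match=" + matchesAll(index, terms));
            }
        }
    }
}
```

Under these assumptions, only GeOR misses with generateWordParts = 1: in GeO and GeoR the split-off parts O and R are shorter than minGramSize and produce no terms, while OR produces "or". With generateWordParts = 0 only the catenated token survives, so GeOR yields ge, geo and geor and matches.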

Upvotes: 0

Guillaume Smet

Reputation: 10519

First, try to use Luke to see what has been indexed in your Lucene index: https://github.com/DmitryKey/luke/releases . You will be able to see the tokens, which might help you to understand what is happening.

Make sure your analyzer is correctly defined on your field and that it is applied to your query too (it might be a good idea to show us how you defined your field and how you execute your query).

If you end up thinking it's a bug, you can use our test case template at https://github.com/hibernate/hibernate-test-case-templates/tree/master/search/hibernate-search-lucene to provide us with a self-contained test case so that we can take a look.

Upvotes: 2
