How to add the last two digits of a year to a Hibernate Search / Lucene index

Question

In my database I store years in their complete form, for example 2012, 2013, 2014. That is also how they are stored in my index. I would like to store the last two digits in the index as well, for example 12, 13, 14, so that a keyword search on either 2012 or 12 finds the same record.

My main search analyzer looks like this.

@AnalyzerDefs({
    @AnalyzerDef(name = "searchtokenanalyzer",
            // Split input into tokens according to tokenizer
            tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
            filters = {
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                    @Parameter(name = "pattern", value = "([^a-zA-Z0-9\\-])"),
                    @Parameter(name = "replacement", value = ""),
                    @Parameter(name = "replace", value = "all")}),
                @TokenFilterDef(factory = StopFilterFactory.class),
                @TokenFilterDef(factory = TrimFilterFactory.class)
            }),

I have a second analyzer for handling the year abbreviation that looks like this.

@AnalyzerDef(name = "yearanalyzer",
            // Split input into tokens according to tokenizer
            tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
            filters = {
                @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                    @Parameter(name = "pattern", value = "^.{2}"),
                    @Parameter(name = "replacement", value = ""),
                    @Parameter(name = "replace", value = "all")}),
                @TokenFilterDef(factory = StopFilterFactory.class),
                @TokenFilterDef(factory = TrimFilterFactory.class)
            })

And on my entity field I have the following.

@Entity
@Indexed
public class YearLookup {
    @Fields({
            @Field(name = "name", store = Store.NO, index = Index.YES,
                    analyze = Analyze.YES, analyzer = @Analyzer(definition = "searchtokenanalyzer")),
            @Field(name = "abbr", store = Store.NO, index = Index.YES, 
                    analyze = Analyze.YES, analyzer = @Analyzer(definition = "yearanalyzer"))
        })
        private String name;
    }

So far everything is making it into the index correctly. I can see

name 2012,2013,2014
abbr 12,13,14

When I search against YearLookup.class with the following code, the abbr value gets cut down by two digits again, producing a null value, while name remains intact.

public interface SearchParam {
    public static final String[] SEARCH_FIELDS = new String[]{"yearLookup.name", "yearLookup.abbr"};
}

String searchString = "14";

QueryBuilder queryBuilder = fullTextSession.getSearchFactory().buildQueryBuilder().forEntity(YearLookup.class).get();

TermMatchingContext onWildCardFields = queryBuilder.keyword().wildcard().onField(SearchParam.SEARCH_FIELDS[0]);
            TermMatchingContext onFuzzyFields = queryBuilder.keyword().fuzzy().withThreshold(0.7f)
                    .withPrefixLength(1).onField(SearchParam.SEARCH_FIELDS[0]);

            //Iterate over all the remaining search fields stored in the "VehicleListing" index 
            for (int i = 1; i < SearchParam.SEARCH_FIELDS.length; i++) {
                onWildCardFields.andField(SearchParam.SEARCH_FIELDS[i]);
                onFuzzyFields.andField(SearchParam.SEARCH_FIELDS[i]);
            }

            String[] tokens = searchString.toLowerCase().split("\\s");

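            // Declared here so the snippet compiles; each iteration overwrites the previous query.
            Query luceneQuery = null; // org.apache.lucene.search.Query

            for (String token : tokens) {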
                luceneQuery = queryBuilder.bool()
                        .should(onWildCardFields.matching(token + "*").createQuery())
                        .should(onFuzzyFields.matching(token).createQuery())
                        .createQuery();
            }

FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(luceneQuery, YearLookup.class);

Integer results = fullTextQuery.getResultSize();

When I run my test case against this, I get the following exception.

HSEARCH000146: The query string '14' applied on field 'yearLookup.abbr' has no meaningfull tokens to be matched. Validate the query input against the Analyzer applied on this field.
org.hibernate.search.errors.EmptyQueryException
    at org.hibernate.search.query.dsl.impl.ConnectedMultiFieldsTermQueryBuilder.createQuery(ConnectedMultiFieldsTermQueryBuilder.java:111)
    at org.hibernate.search.query.dsl.impl.ConnectedMultiFieldsTermQueryBuilder.createQuery(ConnectedMultiFieldsTermQueryBuilder.java:86)
    at com.domain.auto.services.search.impl.SearchManagerImpl.doSearch(SearchManagerImpl.java:146)
    at $SearchManager_138fdc525111b303.doSearch(Unknown Source)
    at $SearchManager_138fdc525111b2f3.doSearch(Unknown Source)
    at com.domain.auto.services.search.impl.SearchServiceImplTest.testYearSearch(SearchServiceImplTest.java:92)

Anybody have any thoughts?

Code Junkie · Accepted Answer

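Solution

The yearanalyzer is applied to the search term as well as to the indexed value. The original pattern ^.{2} strips the first two characters of any input, so the query string 14 was reduced to an empty token. Restricting the pattern to four-digit values leaves a two-digit search term untouched.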

@AnalyzerDef(name = "yearanalyzer",
        // Split input into tokens according to tokenizer
        tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                @Parameter(name = "pattern", value = "^\\d{2}(\\d{2})$"),
                @Parameter(name = "replacement", value = "$1"),
                @Parameter(name = "replace", value = "all")}),
        })
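
The behaviour of the two patterns can be checked outside Hibernate Search with plain java.util.regex, which is a rough stand-in for what the pattern replace filter does to a single keyword token. This is only a sketch: the class name YearAbbreviationDemo and the helper methods applyOriginal and applyFixed are made up for illustration, and it assumes the filter's "all" mode behaves like Matcher.replaceAll.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original project.
class YearAbbreviationDemo {
    // Pattern from the question: removes the first two characters of any input.
    static final Pattern ORIGINAL = Pattern.compile("^.{2}");
    // Pattern from the answer: only rewrites inputs that are exactly four digits.
    static final Pattern FIXED = Pattern.compile("^\\d{2}(\\d{2})$");

    static String applyOriginal(String token) {
        return ORIGINAL.matcher(token).replaceAll("");
    }

    static String applyFixed(String token) {
        // $1 is the captured last two digits; non-matching input is returned unchanged.
        return FIXED.matcher(token).replaceAll("$1");
    }

    public static void main(String[] args) {
        for (String token : new String[] {"2014", "14"}) {
            System.out.println(token + " -> original: '" + applyOriginal(token)
                    + "', fixed: '" + applyFixed(token) + "'");
        }
        // Prints:
        // 2014 -> original: '14', fixed: '14'
        // 14 -> original: '', fixed: '14'
    }
}
```

Both patterns index 2014 as 14, but only the fixed one keeps the search term 14 intact, which is why the empty-query exception goes away.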
