When you search Stack Overflow or the Internet for LuceneAnalysisDefinitionProvider, you'll find hundreds of pages, each with the same code copied from another page, without any decent explanation or further usage examples.
So I tried to do it myself and failed. Here is my code:
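import org.apache.lucene.analysis.charfilter.MappingCharFilterFactory;
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.core.StopFilterFactory;
import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;
import org.hibernate.search.analyzer.definition.LuceneAnalysisDefinitionProvider;
import org.hibernate.search.analyzer.definition.LuceneAnalysisDefinitionRegistryBuilder;

public class CustomLuceneAnalysisDefinitionProvider
        implements LuceneAnalysisDefinitionProvider {

    @Override
    public void register(final LuceneAnalysisDefinitionRegistryBuilder builder) {
        builder
            .analyzer("customAnalyzer")
                .tokenizer(StandardTokenizerFactory.class)
                .charFilter(MappingCharFilterFactory.class)
                    .param("mapping",
                        "org/hibernate/search/test/analyzer/mapping-chars.properties")
                .tokenFilter(ASCIIFoldingFilterFactory.class)
                .tokenFilter(LowerCaseFilterFactory.class)
                .tokenFilter(StopFilterFactory.class)
                    // WRONG! It's not "mapping"!
                    // .param("mapping",
                    //     "org/hibernate/search/test/analyzer/stoplist.properties")
                    .param("words",
                        "classpath:/stoplist.properties")
                    .param("ignoreCase", "true");
    }
}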
Now we have CustomLuceneAnalysisDefinitionProvider, and what's next?
Where to put and how to address mapping-chars.properties when adding it as a parameter to MappingCharFilterFactory?
What is the contents of mapping-chars.properties and how to create mine or modify existing?
Where to put stoplist.properties and how to address it when adding as mapping parameter to StopFilterFactory?
How to add previously defined customAnalyzer to single @Field mentioned below?
@Field(
    index = Index.YES,
    analyze = Analyze.YES,
    store = Store.NO,
    bridge = @FieldBridge(impl = LocalizedFieldBridge.class)
)
private LocalizedField description;
On some pages I found an option to put this definition into application.properties:
hibernate.search.lucene.analysis_definition_provider = com.thevegcat.app.search.CustomAnalysisDefinitionProvider
But I don't want to replace the original analyzer, I just want to use a custom analyzer for a few specific properties.
EDIT#1
Looking into org.apache.lucene.analysis.core.StopFilterFactory, line 86, one can see that it takes words as the key, not mapping.
EDIT#2
If you put your stop words file in src/main/resources, you have to address it like this:
.param("words", "classpath:/stoplist.properties")
you'll find hundreds of pages, each of them having the same code copied from another page without any decent explanation or further examples of usage.
Hibernate Search 5 had its problems, one of which was lack of documentation in some areas. Now that it's in maintenance mode, those problems are unlikely to get addressed.
There is some documentation for that feature in the Hibernate Search 5 documentation: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#section-programmatic-analyzer-definition
You'll get better documentation of that feature by migrating to Hibernate Search 6+.
That being said, most of your questions relate to Lucene features, so you probably won't find answers in Hibernate Search's documentation. You may find them in Lucene's documentation. How to find that documentation is explained in the Hibernate Search 6 documentation:
To know more about the behavior of these character filters, tokenizers and token filters, either browse the Lucene Javadoc or read the corresponding section on the Solr Wiki (you don’t need Solr to use these analyzers, it’s just that there is no documentation page for Lucene proper).
Where to put and how to address mapping-chars.properties when adding it as a parameter to MappingCharFilterFactory?
In your classpath.
What is the contents of mapping-chars.properties and how to create mine or modify existing?
That's the kind of thing that Lucene doesn't document, at least not clearly. Solr's documentation is better: https://solr.apache.org/guide/6_6/charfilterfactories.html#CharFilterFactories-solr.MappingCharFilterFactory
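As a rough illustration of what such a mapping file holds, here is a minimal plain-Java sketch. It is not Lucene's implementation. It assumes the rule syntax described in the Solr guide linked above: one rule per line, written as "source" => "target", with lines starting with # treated as comments. Escape sequences inside the quotes are not handled here. MappingRulesSketch and its methods are hypothetical names made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: mimics the idea behind a
// character mapping file without using Lucene at all.
class MappingRulesSketch {

    // One rule per line, written as "source" => "target".
    private static final Pattern RULE =
            Pattern.compile("^\"(.*)\"\\s*=>\\s*\"(.*)\"\\s*$");

    // Parses rule lines, skipping blank lines and lines starting with '#'.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> rules = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            Matcher m = RULE.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException("Invalid mapping rule: " + line);
            }
            rules.put(m.group(1), m.group(2));
        }
        return rules;
    }

    // At each position, replaces the longest matching source string, if any.
    static String apply(String input, Map<String, String> rules) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            String best = null;
            for (String source : rules.keySet()) {
                if (!source.isEmpty() && input.startsWith(source, i)
                        && (best == null || source.length() > best.length())) {
                    best = source;
                }
            }
            if (best != null) {
                out.append(rules.get(best));
                i += best.length();
            } else {
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the lines of a mapping file: a-acute and sharp s.
        List<String> fileLines = List.of(
                "# fold two accented characters to ASCII",
                "\"\u00E1\" => \"a\"",
                "\"\u00DF\" => \"ss\"");
        Map<String, String> rules = parse(fileLines);
        System.out.println(apply("Stra\u00DFe \u00E1", rules)); // prints "Strasse a"
    }
}
```

To write your own file, you would list one such rule per character (or character sequence) you want rewritten before tokenization.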
Where to put stoplist.properties and how to address it when adding as mapping parameter to StopFilterFactory?
Put it in the classpath, and pass the path to that file from the root of your classpath.
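If the file still isn't found, it can help to check that it really is visible from the root of the classpath. Below is a minimal sketch, assuming the file sits at the classpath root (as it would when placed in src/main/resources in a standard Maven or Gradle layout). ClasspathResourceCheck and its methods are hypothetical names. The sketch only checks visibility through the class loader; it does not reproduce how Hibernate Search or Lucene resolve the path.

```java
import java.net.URL;

// Hypothetical helper, for illustration only.
class ClasspathResourceCheck {

    // Turns "classpath:/stoplist.properties" or "/stoplist.properties"
    // into the plain resource name "stoplist.properties".
    static String toResourceName(String location) {
        String prefix = "classpath:";
        String name = location.startsWith(prefix)
                ? location.substring(prefix.length())
                : location;
        while (name.startsWith("/")) {
            name = name.substring(1);
        }
        return name;
    }

    // Returns true when the class loader can see the resource.
    static boolean isOnClasspath(String location) {
        ClassLoader loader = ClasspathResourceCheck.class.getClassLoader();
        URL url = loader.getResource(toResourceName(location));
        return url != null;
    }

    public static void main(String[] args) {
        // True only if stoplist.properties is at the classpath root.
        System.out.println(isOnClasspath("classpath:/stoplist.properties"));
    }
}
```

A false result here usually means the file was not copied to the build output, or that it lives in a subdirectory that has to be part of the path.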
How to add previously defined customAnalyzer to single @Field mentioned below?
Well that is documented, at least: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#_referencing_named_analyzers
@Field(analyzer = @Analyzer(definition = "customAnalyzer"))
On some pages I found an option to put this definition into application.properties:
hibernate.search.lucene.analysis_definition_provider = com.thevegcat.app.search.CustomAnalysisDefinitionProvider
But I don't want to replace the original analyzer, I just want to use a custom analyzer for a few specific properties.
You won't replace an "analyzer"; you will register an analysis definition provider, which adds analyzer definitions to Hibernate Search that can then be referenced from @Field. Setting an analysis definition provider does not, in itself, change your mapping in any way.