Eduardo Lopes

Reputation: 161

How to implement a phonetic search using Lucene?

I want to implement a phonetic search using Lucene 6.1.0, with Soundex or any other algorithm suitable for Portuguese. I found many incomplete examples on the internet showing how to implement a custom tokenizer or analyzer, but the abstract classes used in those examples do not seem to match the ones in version 6.1.0. Can anyone point me to good Lucene documentation, beyond the bare Javadocs, that explains how to put the pieces together?

Thanks in advance.

Upvotes: 4

Views: 3234

Answers (1)

femtoRgon

Reputation: 33351

The Analyzer documentation shows how to create your analyzer.

For phonetic analysis, look at the org.apache.lucene.analysis.phonetic package. You'll need to add "lucene-analyzers-phonetic-6.1.0.jar" to your build path, as well as Apache's "commons-codec-1.10.jar", which you can get here.

Then you can set up your analyzer like this, for instance:

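import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.phonetic.DoubleMetaphoneFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

Analyzer analyzer = new Analyzer() {
    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        Tokenizer tokenizer = new StandardTokenizer();
        // Encode each token with Double Metaphone: codes of at most 6 characters;
        // inject=false indexes only the codes, not the original tokens.
        TokenStream stream = new DoubleMetaphoneFilter(tokenizer, 6, false);
        return new TokenStreamComponents(tokenizer, stream);
    }
};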
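
To get a feel for what a phonetic filter puts into the index, here is a minimal sketch of the classic American Soundex encoding in plain Java. It is not the code Lucene or commons-codec uses, and the class name SoundexSketch is made up for illustration. It also assumes plain ASCII input: accented letters are simply skipped, and the rules are English-oriented, so treat it as a way to understand the idea rather than as something tuned for Portuguese.

```java
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene or commons-codec.
class SoundexSketch {
    // Soundex digit for each letter A..Z; '0' marks letters that are not coded
    // (the vowels plus H, W and Y).
    private static final String CODES = "01230120022455012623010202";

    static String soundex(String word) {
        StringBuilder out = new StringBuilder();
        char lastCode = '0';
        for (char c : word.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c < 'A' || c > 'Z') {
                continue; // assumption: anything outside A-Z (including accents) is ignored
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);   // the first letter is kept as-is
                lastCode = code; // but its code still suppresses an identical neighbour
                continue;
            }
            if (c == 'H' || c == 'W') {
                continue; // H and W do not separate letters that share a code
            }
            if (code != '0' && code != lastCode) {
                out.append(code);
            }
            lastCode = code; // a vowel resets this, so a repeated code after it is kept
            if (out.length() == 4) {
                break;
            }
        }
        // Pad short codes with zeros up to the usual length of four.
        while (out.length() > 0 && out.length() < 4) {
            out.append('0');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Different spellings that sound alike end up with the same code.
        System.out.println(soundex("Robert") + " " + soundex("Rupert"));
        System.out.println(soundex("Sousa") + " " + soundex("Souza"));
    }
}
```

Because both spellings in each pair produce the same code, a query for one matches documents containing the other, provided the same analyzer is used at index time and at query time. The Double Metaphone filter above works on the same principle with a more elaborate rule set.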

Upvotes: 8
