solidfox

Reputation: 571

Lucene Analyzer

I have used Lucene to index documents and search them, but so far only for English. I now have a project in Kurdish. Kurdish uses some Arabic Unicode characters plus several other characters; here is a table of the Unicode characters used in the Kurdish-Arabic script.

My question is: how do I create an Analyzer for this language, or can I use the Arabic Analyzer for this purpose?

Upvotes: 3

Views: 694

Answers (2)

Bart Czernicki

Reputation: 3683

To answer your question about how to create a custom Analyzer for a new language: the book "Lucene in Action" covers the creation of custom analyzers in detail. You can reuse a lot of the code found in other analyzers and change only what you need. Lucene is open source and very extensible, so profiling these changes is pretty easy.

Upvotes: 1

mindas

Reputation: 26703

Lucene has a list of other analyzers, including one for Arabic. I'm afraid none of them specifically targets Kurdish, but maybe you can extend the Arabic analyzer to fit your needs?

Just bear in mind that all these analyzers come separately from the main Lucene distribution.
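
As a rough illustration of what "extending" might involve, here is a minimal sketch of the character folding step that a Kurdish-aware analyzer could apply to each token on top of Arabic handling. It uses only the JDK so it can be tried without Lucene on the classpath; in a real analyzer this logic would live inside a token filter. The class name `KurdishFolding`, the method `fold`, and the particular mappings are assumptions chosen for illustration, not a complete or authoritative normalization for Kurdish.

```java
import java.util.Map;

// Hypothetical helper, not part of Lucene: folds a few Arabic code points
// into the forms typically used in Kurdish (Arabic-script) text.
final class KurdishFolding {

    // Assumed mapping table for illustration only; extend it from the
    // Kurdish-Arabic character table you are working with.
    private static final Map<Character, Character> FOLD = Map.of(
        '\u064A', '\u06CC', // U+064A ARABIC YEH   -> U+06CC FARSI YEH
        '\u0649', '\u06CC', // U+0649 ALEF MAKSURA -> U+06CC FARSI YEH
        '\u0643', '\u06A9'  // U+0643 ARABIC KAF   -> U+06A9 KEHEH
    );

    // Returns the input with tatweel and short-vowel marks removed and
    // the mapped letters replaced; all other characters pass through.
    static String fold(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean tatweel = c == '\u0640';
            boolean harakat = c >= '\u064B' && c <= '\u0652';
            if (tatweel || harakat) {
                continue; // drop elongation and diacritic marks
            }
            char folded = FOLD.getOrDefault(c, c);
            out.append(folded);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Arabic KAF + Arabic YEH is folded to KEHEH + FARSI YEH.
        System.out.println(fold("\u0643\u064A"));
    }
}
```

Applying the same folding at both index time and query time is what makes variant spellings of a word match each other.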

Upvotes: 1
