Felipe Hummel

Reputation: 4774

In Lucene how can a TokenFilter emit more than one term?

I'm working with Lucene 3.2. How can I use a TokenFilter that doesn't just filter/modify a term, but can also insert other terms into the stream?

For example, I want a filter that takes "tv42lcd" as input and inserts the terms "tv42lcd", "tv", "42", and "lcd" into the stream.

I'm aware that I could do this by implementing my own Tokenizer, but I would rather keep using the provided StandardTokenizer.

Upvotes: 2

Views: 470

Answers (1)

mindas

Reputation: 26733

You can always mix the default with custom logic: use the StandardTokenizer logic where possible, then wrap its output and add your custom tokenization on top. You can achieve this by extending the class, but composition is almost always the better choice.
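To make the wrapping idea concrete, here is a minimal sketch in plain Java, with no Lucene on the classpath. It only models the shape of the solution: a wrapper pulls one token from upstream, returns it, and queues any extra sub-tokens so that the following calls drain the queue before touching upstream again. The class name `SplittingTokenIterator`, the helper `expand`, and the letters-versus-digits splitting rule are all made up for illustration. In a real Lucene 3.x filter the same logic would presumably sit inside `TokenFilter.incrementToken()`, with the term text carried by `CharTermAttribute` rather than returned as a `String`; that wiring is not shown here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a token filter: wraps an upstream token source
// (composition) and emits extra tokens after each original one.
final class SplittingTokenIterator implements Iterator<String> {
    // Assumed splitting rule: maximal runs of ASCII letters or of digits.
    private static final Pattern RUN = Pattern.compile("[A-Za-z]+|[0-9]+");

    private final Iterator<String> upstream;
    // Sub-tokens waiting to be emitted before we read upstream again.
    private final Deque<String> pending = new ArrayDeque<String>();

    SplittingTokenIterator(Iterator<String> upstream) {
        this.upstream = upstream;
    }

    @Override
    public boolean hasNext() {
        return !pending.isEmpty() || upstream.hasNext();
    }

    @Override
    public String next() {
        // Drain queued sub-tokens first.
        if (!pending.isEmpty()) {
            return pending.poll();
        }
        if (!upstream.hasNext()) {
            throw new NoSuchElementException();
        }
        String token = upstream.next();
        List<String> parts = new ArrayList<String>();
        Matcher m = RUN.matcher(token);
        while (m.find()) {
            parts.add(m.group());
        }
        // Queue the parts only if the token actually splits into several.
        if (parts.size() > 1) {
            pending.addAll(parts);
        }
        // The original token is always emitted first.
        return token;
    }

    // Convenience helper for trying the wrapper on a list of tokens.
    static List<String> expand(List<String> tokens) {
        List<String> out = new ArrayList<String>();
        Iterator<String> it = new SplittingTokenIterator(tokens.iterator());
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

Calling `SplittingTokenIterator.expand(Arrays.asList("tv42lcd", "hello"))` yields `tv42lcd`, `tv`, `42`, `lcd`, `hello`. If you port this to a real filter, you will probably also want to decide what position increment the extra terms get (0 stacks them on the original term's position), which this sketch does not model.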

Upvotes: 1
