Reputation: 2721
Lucene 4.2.1 does not seem to have StandardAnalyzer, and I am not sure how to implement a basic analyzer that does not alter the source text. Any pointers?
final SimpleFSDirectory DIRECTORY = new SimpleFSDirectory(new File(ELEMENTS_INDEX_DIR));
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_42, new Analyzer() {
    @Override
    protected TokenStreamComponents createComponents(String s, Reader reader) {
        return null;
    }
});
IndexWriter indexWriter = new IndexWriter(DIRECTORY, indexWriterConfig);
List<ModelObject> elements = dao.getAll();
for (ModelObject element : elements) {
    Document document = new Document();
    document.add(new StringField("id", String.valueOf(element.getId()), Field.Store.YES));
    document.add(new TextField("name", element.getName(), Field.Store.YES));
    indexWriter.addDocument(document);
}
indexWriter.close();
Upvotes: 3
Views: 6009
Reputation: 1078
You should add the Common Analyzers to your project. They are now available in a separate JAR file in the Lucene-4.2.1.zip file under "analysis/common".
lucene-analyzers-common-4.*.jar
Once you add it to your project (as you added the core) you should have this working:
import org.apache.lucene.analysis.standard.StandardAnalyzer;
Upvotes: 2
Reputation: 33351
You have to return a TokenStreamComponents instance from createComponents. Returning null is not adequate.
However, Lucene 4.2.1 certainly does have StandardAnalyzer.
If you are, perhaps, referring to the changes in StandardAnalyzer in Lucene 4.x, and are looking for the old StandardAnalyzer, then you'll want ClassicAnalyzer.
If you really want a trimmed down Analyzer that doesn't modify anything, but just tokenizes in a very simple fashion, perhaps WhitespaceAnalyzer will serve your purposes.
If you don't want it modified or tokenized at all, then use KeywordAnalyzer.
And if you must create your very own Analyzer, as you say, then override the method createComponents, and actually build and return an instance of TokenStreamComponents. If none of the above four serve your needs, I have no idea what your needs are, so I won't attempt a specific example here, but here is the example from the Analyzer docs:
Analyzer analyzer = new Analyzer() {
    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        Tokenizer source = new FooTokenizer(reader);
        TokenStream filter = new FooFilter(source);
        filter = new BarFilter(filter);
        return new TokenStreamComponents(source, filter);
    }
};
By the way, TokenStreamComponents also has a single-argument constructor that takes only the Tokenizer, so the filter is optional.
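For what it's worth, here is a minimal sketch of what such a hand-rolled analyzer could look like with the tokenizer-only constructor. It assumes Lucene 4.2.x with both lucene-core and lucene-analyzers-common on the classpath, and it uses WhitespaceTokenizer as the source, so text is split on whitespace and otherwise left as-is. The class name PassThroughAnalyzerSketch, the helper names whitespaceOnly and tokens, and the field name "name" are made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

// Hypothetical class name; a sketch against the Lucene 4.2 API, not a drop-in solution.
public class PassThroughAnalyzerSketch {

    // Builds an Analyzer whose chain is just a tokenizer: no lowercasing,
    // no stop words, no stemming.
    static Analyzer whitespaceOnly() {
        return new Analyzer() {
            @Override
            protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
                Tokenizer source = new WhitespaceTokenizer(Version.LUCENE_42, reader);
                // Single-argument constructor: the tokenizer is the whole chain.
                return new TokenStreamComponents(source);
            }
        };
    }

    // Runs the analyzer over the text and collects the emitted terms,
    // which is handy for checking what would actually be indexed.
    static List<String> tokens(Analyzer analyzer, String text) throws IOException {
        List<String> out = new ArrayList<String>();
        TokenStream stream = analyzer.tokenStream("name", new StringReader(text));
        try {
            CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
            stream.reset(); // must be called before the first incrementToken()
            while (stream.incrementToken()) {
                out.add(term.toString());
            }
            stream.end();
        } finally {
            stream.close();
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tokens(whitespaceOnly(), "Hello, World! It's Lucene"));
    }
}
```

The terms keep their original case and punctuation. Swapping WhitespaceTokenizer for KeywordTokenizer (constructed with just the reader) should give you the whole field value as a single token instead, which is what KeywordAnalyzer does.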
Upvotes: 9