Searching for Terms with whitespace using Lucene

Question

I'm trying to use Lucene to add a search feature, but I can't get an index to work with significant whitespace. I've set up the following test case:

// Index one document whose "content" field holds "Bill Evans"
RAMDirectory directory = new RAMDirectory();
KeywordAnalyzer analyzer = new KeywordAnalyzer();
IndexWriterConfig config = new IndexWriterConfig(analyzer);
IndexWriter writer = new IndexWriter(directory, config);
Document doc = new Document();
doc.add(new TextField("content", "Bill Evans", Field.Store.NO));
writer.addDocument(doc);
writer.close();

IndexReader reader = DirectoryReader.open(directory);
IndexSearcher searcher = new IndexSearcher(reader);

// Parse the query with the same analyzer, without pre-splitting on whitespace
QueryParser parser = new QueryParser("content", analyzer);
parser.setSplitOnWhitespace(false);
Query query = parser.parse("Bill E");

// Expect at least one hit; this is the assertion that fails
TopDocs docs = searcher.search(query, 1);
assertTrue(docs.totalHits > 0);

I'm using Lucene 6.6.0, and from what I understand, the KeywordAnalyzer is what I'm looking for:

"Tokenizes" the entire stream as a single token. This is useful for data like zip codes, ids, and some product names.

But I can't seem to get any matching documents that contain whitespace.
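To make the quoted "single token" description concrete, here is a minimal plain-Java sketch. It is a toy model, not Lucene code: the class and method names (SingleTokenSketch, keywordTokens, exactMatch, prefixMatch) are hypothetical, and it assumes that the parsed query ends up as an exact lookup of the term "Bill E", which I have not verified. Under that assumption, a field indexed as the one token "Bill Evans" would only be found by the full string or by a prefix-style comparison.

```java
import java.util.Collections;
import java.util.List;

// Toy model only: this does not call Lucene and the names are hypothetical.
class SingleTokenSketch {

    // Mimics the quoted KeywordAnalyzer description: the whole input is one token.
    static List<String> keywordTokens(String text) {
        return Collections.singletonList(text);
    }

    // An exact term lookup succeeds only if the query equals an indexed token.
    static boolean exactMatch(List<String> indexedTokens, String queryTerm) {
        return indexedTokens.contains(queryTerm);
    }

    // A prefix lookup succeeds if any indexed token starts with the query text.
    static boolean prefixMatch(List<String> indexedTokens, String prefix) {
        for (String token : indexedTokens) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = keywordTokens("Bill Evans");
        System.out.println(exactMatch(indexed, "Bill E"));     // false
        System.out.println(exactMatch(indexed, "Bill Evans")); // true
        System.out.println(prefixMatch(indexed, "Bill E"));    // true
    }
}
```

If this model is close to what happens in the test case, the partial text "Bill E" would need a prefix-style query rather than an exact term match, but that is a guess on my part.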

Any ideas on how to solve this?
