Reputation: 8378
I'm trying to get started with Lucene. The code I'm using to index documents is:
public void index(String type, String words) {
IndexWriter indexWriter = null;
try {
if (dir == null)
dir = createAndPropagate();
indexWriter = new IndexWriter(dir, new StandardAnalyzer(), true,
new KeepOnlyLastCommitDeletionPolicy(),
IndexWriter.MaxFieldLength.UNLIMITED);
Field wordsField = new Field(FIELD_WORDS, words, Field.Store.YES,
Field.Index.ANALYZED);
Field typeField = new Field(FIELD_TYPE, type, Field.Store.YES,
Field.Index.ANALYZED);
Document doc = new Document();
doc.add(wordsField);
doc.add(typeField);
indexWriter.addDocument(doc);
indexWriter.commit();
} catch (IOException e) {
logger.error("Problems while adding entry to index.", e);
} finally {
try {
if (indexWriter != null)
indexWriter.close();
} catch (IOException e) {
logger.error("Unable to close index writer.", e);
}
}
}
The search looks like this:
public List<TagSearchEntity> searchFor(final String type, String words,
int amount) {
List<TagSearchEntity> result = new ArrayList<TagSearchEntity>();
try {
if (dir == null)
dir = createAndPropagate();
for (final Document doc : searchFor(dir, type, words, amount)) {
@SuppressWarnings("serial")
TagSearchEntity searchResult = new TagSearchEntity() {{
setType(type);
setWords(doc.getField(FIELD_WORDS).stringValue());
}};
result.add(searchResult);
}
} catch (IOException e) {
logger.error("Problems while searching", e);
}
return result;
}
private List<Document> searchFor(Directory indexDirectory, String type,
String words, int amount) throws IOException {
Searcher indexSearcher = new IndexSearcher(indexDirectory);
final Query tagQuery = new TermQuery(new Term(FIELD_WORDS, words));
final Query typeQuery = new TermQuery(new Term(FIELD_TYPE, type));
@SuppressWarnings("serial")
BooleanQuery query = new BooleanQuery() {{
add(tagQuery, BooleanClause.Occur.SHOULD);
add(typeQuery, BooleanClause.Occur.MUST);
}};
List<Document> result = new ArrayList<Document>();
for (ScoreDoc scoreDoc : indexSearcher.search(query, amount).scoreDocs) {
result.add(indexSearcher.doc(scoreDoc.doc));
}
indexSearcher.close();
return result;
}
I've got two use cases. The first adds a document of some type, searches for it, then adds a document of another type, searches for it, and so on. The second adds all the documents first and then searches for them. The first one works fine:
@Test
public void testSearch() {
search.index("type1", "test type1 for test purposes test test");
List<TagSearchEntity> result = search.searchFor("type1", "test", 10);
assertNotNull("Retrieved list should not be null.", result);
assertTrue("Retrieved list should not be empty.", !result.isEmpty());
search.index("type2", "test type2 for test purposes test test");
result.clear();
result = search.searchFor("type2", "test", 10);
assertTrue("Retrieved list should not be empty.", !result.isEmpty());
search.index("type3", "test type3 for test purposes test test");
result.clear();
result = search.searchFor("type3", "test", 10);
assertTrue("Retrieved list should not be empty.", !result.isEmpty());
}
But the second one seems to index only the last document:
@Test
public void testBuggy() {
search.index("type1", "test type1 for test purposes test test");
search.index("type2", "test type2 for test purposes test test");
search.index("type3", "test type3 for test purposes test test");
List<TagSearchEntity> result = search.searchFor("type3", "test", 10);
assertNotNull("Retrieved list should not be null.", result);
assertTrue("Retrieved list should not be empty.", !result.isEmpty());
result.clear();
result = search.searchFor("type2", "test", 10);
assertTrue("Retrieved list should not be empty.", !result.isEmpty());
result.clear();
result = search.searchFor("type1", "test", 10);
assertTrue("Retrieved list should not be empty.", !result.isEmpty());
}
It successfully finds type3, but fails to find any of the others. If I shuffle those calls around, it still finds only the last indexed document.
The Lucene version I'm using is:
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>2.4.1</version>
</dependency>
<dependency>
<groupId>lucene</groupId>
<artifactId>lucene</artifactId>
<version>1.4.3</version>
</dependency>
What am I doing wrong? How do I make it index all documents?
Upvotes: 0
Views: 156
Reputation: 5042
A new index is created on every index operation. The third argument to the IndexWriter constructor is the create flag, and it is being set to true. According to the IndexWriter documentation, when this flag is set the writer either creates a new index or overwrites the existing one. Set it to false to append to the existing index.
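To make the effect concrete, here is a minimal toy model in plain Java. It is not the Lucene API: `ToyDirectory`, `ToyWriter`, and `indexAll` are hypothetical names invented for illustration, and the only behaviour modelled is that opening a writer with `create == true` discards whatever the directory already holds. It mirrors the question's `index()` method, which opens a fresh writer for every document.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a Lucene index; NOT the Lucene API.
class ToyIndexDemo {

    // Plays the role of the shared Directory: it outlives each writer.
    static final class ToyDirectory {
        final List<String> docs = new ArrayList<String>();
    }

    // Plays the role of IndexWriter; 'create' mirrors the third constructor argument.
    static final class ToyWriter {
        private final ToyDirectory dir;

        ToyWriter(ToyDirectory dir, boolean create) {
            this.dir = dir;
            if (create) {
                // create == true: start from an empty index, dropping earlier documents
                dir.docs.clear();
            }
        }

        void addDocument(String doc) {
            dir.docs.add(doc);
        }
    }

    // Mimics the question's index(): one new writer per document.
    static List<String> indexAll(boolean create, String... docs) {
        ToyDirectory dir = new ToyDirectory();
        for (String doc : docs) {
            new ToyWriter(dir, create).addDocument(doc);
        }
        return dir.docs;
    }

    public static void main(String[] args) {
        // create == true on every call: only the last document survives
        System.out.println(indexAll(true, "type1", "type2", "type3"));  // prints [type3]
        // create == false: documents accumulate
        System.out.println(indexAll(false, "type1", "type2", "type3")); // prints [type1, type2, type3]
    }
}
```

This also explains why the first test passes: each search runs right after the document it looks for was written. One caveat for the real code: with the flag set to false, the index must already exist when the writer is opened. As far as I recall, Lucene 2.4 also has IndexWriter constructors without the boolean argument that create the index only when it does not exist yet; check the javadoc for your version before relying on that.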
Upvotes: 2