Reputation: 547
I am just learning how to use the Lucene Java API and set up a small example as a proof of concept. Whenever I send in a query, it gives me exactly one result: the first item in the dictionary, no matter what the query is.
I have 89834 terms in my dictionary; here are the first 5.
Field name: "drug"
"cysteine"
"glycine"
"dihydroxyacetone"
"glycerone"
"arginine"
...
A search for "arginine" returns the following:
Found 1 results
"cysteine"
Anything I put in returns cysteine as the only result.
Here's the code.
def luceneTest(cxn: RepositoryConnection)
{
  //build the index
  val analyzer: Analyzer = new StandardAnalyzer()
  val indexPath: Path = Paths.get("lucene/model1")
  val directory: Directory = FSDirectory.open(indexPath)
  val config: IndexWriterConfig = new IndexWriterConfig(analyzer)
  val iwriter: IndexWriter = new IndexWriter(directory, config)
  val doc = addCSVtoLuceneIndex("lucene_dictionary.csv")
  iwriter.addDocument(doc)
  iwriter.close()

  //query the index
  val ireader: DirectoryReader = DirectoryReader.open(directory)
  val isearcher: IndexSearcher = new IndexSearcher(ireader)
  val parser: QueryParser = new QueryParser("drug", analyzer)
  val query: Query = parser.parse("arginine")
  val hits: Array[ScoreDoc] = isearcher.search(query, 10).scoreDocs
  logger.info("found " + hits.size + " results.")
  for (a <- hits)
  {
    val hitdoc: Document = isearcher.doc(a.doc)
    logger.info(hitdoc.get("drug"))
  }
  ireader.close()
  directory.close()
}

def addCSVtoLuceneIndex(dictionary: String): Document =
{
  val doc: Document = new Document()
  val br: BufferedReader = new BufferedReader(new FileReader(dictionary))
  var index = 1
  for (b <- 1 to 83984)
  {
    var line = br.readLine()
    var strArray: Array[String] = line.split(",")
    var strToAdd = ""
    for (a <- 1 to strArray.length - 1) strToAdd += strArray(a)
    doc.add(new Field("drug", strToAdd, TextField.TYPE_STORED))
    //logger.info("added " + strToAdd)
  }
  doc
}
Upvotes: 0
Views: 52
Reputation: 1770
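You iterate over your file in addCSVtoLuceneIndex, but you add every line as another "drug" field of a single Lucene document. The index therefore contains only one document, so any query that matches returns that one hit, and hitdoc.get("drug") returns the first value stored under that field name, which is "cysteine".
I suppose you wanted one document per line: create a new Document for each line and call iwriter.addDocument for each of them.
Also, are you sure you want to index each line as is, just with the commas removed?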
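The one-document-per-line idea can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive fix: the extra `iwriter` parameter, the `Int` return value, and the wrapper object name `LuceneCsvIndexing` are hypothetical and do not appear in the question. It keeps the question's way of building the field value (skip column 0, concatenate the rest) and reads until end of file instead of using a hard-coded line count.

```scala
import java.io.{BufferedReader, FileReader}
import org.apache.lucene.document.{Document, Field, TextField}
import org.apache.lucene.index.IndexWriter

object LuceneCsvIndexing {
  // Hypothetical variant of addCSVtoLuceneIndex: instead of returning one big
  // Document, it adds one Document per CSV line to the writer it is given.
  // Returns the number of documents added.
  def addCSVtoLuceneIndex(dictionary: String, iwriter: IndexWriter): Int = {
    val br = new BufferedReader(new FileReader(dictionary))
    var count = 0
    try {
      var line = br.readLine()
      while (line != null) { // readLine returns null at end of file
        val strArray = line.split(",")
        // Same value as in the question: columns 1..n joined without separator.
        val strToAdd = strArray.drop(1).mkString
        val doc = new Document() // a fresh document for every line
        doc.add(new Field("drug", strToAdd, TextField.TYPE_STORED))
        iwriter.addDocument(doc)
        count += 1
        line = br.readLine()
      }
    } finally {
      br.close()
    }
    count
  }
}
```

In luceneTest, the two lines that build and add `doc` would then become a single call such as `LuceneCsvIndexing.addCSVtoLuceneIndex("lucene_dictionary.csv", iwriter)`. Note that IndexWriterConfig opens an existing index in CREATE_OR_APPEND mode by default, so the single large document written by earlier runs would still be in lucene/model1; deleting that directory or calling `config.setOpenMode(IndexWriterConfig.OpenMode.CREATE)` before creating the writer should start from an empty index.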
Upvotes: 1