hayfreed

Reputation: 547

Lucene API Query always returns first dictionary result

I am just learning how to use the Lucene Java API and I set up a small example for proof of concept. Whenever I send in a query, it gives me one result: the first item in the dictionary, no matter what the query is.

I have 89834 terms in my dictionary; here are the first 5.

Field name: "drug"
"cysteine"
"glycine"
"dihydroxyacetone"
"glycerone"
"arginine"
...

A search for "arginine" returns the following:

Found 1 results
"cysteine"

Anything I put in returns cysteine as the only result.

Here's the code.

def luceneTest(cxn: RepositoryConnection)
{
    //build the index
    val analyzer: Analyzer = new StandardAnalyzer()
    val indexPath: Path = Paths.get("lucene/model1")
    val directory: Directory = FSDirectory.open(indexPath)
    val config: IndexWriterConfig = new IndexWriterConfig(analyzer)
    val iwriter: IndexWriter = new IndexWriter(directory, config)
    val doc = addCSVtoLuceneIndex("lucene_dictionary.csv")
    iwriter.addDocument(doc)
    iwriter.close()

    //query the index
    val ireader: DirectoryReader = DirectoryReader.open(directory)
    val isearcher: IndexSearcher = new IndexSearcher(ireader)
    val parser: QueryParser = new QueryParser("drug", analyzer)
    val query: Query = parser.parse("arginine")
    val hits: Array[ScoreDoc] = isearcher.search(query, 10).scoreDocs
    logger.info("found " + hits.size + " results.")
    for (a <- hits)
    {
        val hitdoc: Document = isearcher.doc(a.doc)
        logger.info(hitdoc.get("drug"))
    }
    ireader.close()
    directory.close()
}

def addCSVtoLuceneIndex(dictionary: String): Document =
    {
        val doc: Document = new Document()
        val br: BufferedReader = new BufferedReader(new FileReader(dictionary))
        var index = 1
        for (b <- 1 to 83984)
        {
            var line = br.readLine()
            var strArray: Array[String] = line.split(",")
            var strToAdd = ""
            for (a <- 1 to strArray.length - 1) strToAdd += strArray(a)
            doc.add(new Field("drug", strToAdd, TextField.TYPE_STORED))
            //logger.info("added " + strToAdd)
        }
        doc
    }

Upvotes: 0

Views: 52

Answers (1)

Evgeny

Reputation: 1770

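You iterate over your file in addCSVtoLuceneIndex but add every line to a single Lucene document, so the index contains exactly one document and any matching query returns it. hitdoc.get("drug") then returns only the first value added to that multi-valued field, which is why you always see "cysteine". I suppose you wanted one document per line.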

Also, are you sure you want to index each line as is, just with the commas removed?
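
A minimal sketch of the one-document-per-line approach, in case it helps. The object name OneDocPerLine and the helpers drugNameFromCsvLine and buildIndex are hypothetical names, not part of your code or of Lucene. I am also assuming that the first CSV column is an ID you want to skip (as your loop does) and that joining the remaining columns with a space is acceptable; adjust to your data.

```scala
import java.nio.file.Paths
import org.apache.lucene.analysis.standard.StandardAnalyzer
import org.apache.lucene.document.{Document, Field, TextField}
import org.apache.lucene.index.{IndexWriter, IndexWriterConfig}
import org.apache.lucene.store.FSDirectory
import scala.io.Source

object OneDocPerLine {
  // Hypothetical helper: skip the first CSV column (assumed to be an ID)
  // and join the remaining columns with a space instead of gluing them together.
  def drugNameFromCsvLine(line: String): String =
    line.split(",").drop(1).mkString(" ").trim

  // Hypothetical helper: writes one Lucene document per non-empty CSV line.
  def buildIndex(csvPath: String, indexDir: String): Unit = {
    val directory = FSDirectory.open(Paths.get(indexDir))
    val config = new IndexWriterConfig(new StandardAnalyzer())
    // The default open mode appends to an existing index, so an old
    // single-document index would survive; CREATE starts from scratch.
    config.setOpenMode(IndexWriterConfig.OpenMode.CREATE)
    val writer = new IndexWriter(directory, config)
    val source = Source.fromFile(csvPath)
    try {
      // Read until the file ends instead of hard-coding the line count.
      for (line <- source.getLines() if line.nonEmpty) {
        val doc = new Document() // a fresh document for every line
        doc.add(new TextField("drug", drugNameFromCsvLine(line), Field.Store.YES))
        writer.addDocument(doc)
      }
    } finally {
      source.close()
      writer.close()
      directory.close()
    }
  }
}
```

With something like OneDocPerLine.buildIndex("lucene_dictionary.csv", "lucene/model1") in place of the indexing part of luceneTest, each hit in the search loop should correspond to one dictionary entry, so hitdoc.get("drug") returns the term that matched.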

Upvotes: 1
