Marcus
Marcus

Reputation: 151

how to search a file with lucene

I want to do a search for a query within a file "fdictionary.txt" containing a list of words (230,000 words) written line by line. any suggestion why this code is not working? The spell checking part is working and gives me the list of suggestions (I limited the length of the list to 1). what I want to do is to search that fdictionary and if the word is already in there, do not call spell checking. My Search function is not working. It does not give me error! Here is what I have implemented:

public class SpellCorrection {

public static File indexDir = new File("/../idxDir");

public static void main(String[] args) throws IOException, FileNotFoundException, CorruptIndexException, ParseException {

    Directory directory = FSDirectory.open(indexDir);
    SpellChecker spell = new SpellChecker(directory);

    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_20, null);
    File dictionary = new File("/../fdictionary00.txt");
    spell.indexDictionary(new PlainTextDictionary(dictionary), config, true);


    String query = "red"; //kne, console
    String correctedQuery = query; //kne, console

    if (!search(directory, query)) {
        String[] suggestions = spell.suggestSimilar(query, 1);
        if (suggestions != null) {correctedQuery=suggestions[0];}
    }

    System.out.println("The Query was: "+query);
    System.out.println("The Corrected Query is: "+correctedQuery);
}

public static boolean search(Directory directory, String queryTerm) throws FileNotFoundException, CorruptIndexException, IOException, ParseException {
    boolean isIn = false;

    IndexReader indexReader = IndexReader.open(directory);
    IndexSearcher indexSearcher = new IndexSearcher(indexReader);
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_20);

    Term term = new Term(queryTerm);
    Query termQuery = new TermQuery(term);
    TopDocs hits = indexSearcher.search(termQuery, 100);
    System.out.println(hits.totalHits);


    if (hits.totalHits > 0) {
        isIn = true;
    }
    return isIn;
}
}

Upvotes: 1

Views: 1036

Answers (3)

naresh
naresh

Reputation: 2113

where are you indexing the content from fdictionary00.txt?

You can search using IndexSearcher, only when you have index. If you are new to lucene, you might want to check some quick tutorials. (like http://lucenetutorial.com/lucene-in-5-minutes.html)

Upvotes: 1

Hoox
Hoox

Reputation: 1

You could make your SpellChecker instance available in a service and use spellChecker.exist(word).

Be aware that the SpellChecker will not index words 2 characters or less. To get around this you can add them to the index after you have created it (add them into SpellChecker.F_WORD field).

If you want to add to your live index and make them available for exist(word) then you will need to add them to the SpellChecker.F_WORD field. Of course, because you're not adding to all the other fields such as gram/start/end etc then your word will not appear as a suggestion for other misspelled words.

In this case you'd have had to add the word into your file so when you re-create the index it would then be available as a suggestion. It would be great if the project made SpellChecker.createDocument(...) public/protected, rather than private, as this method accomplishes everything with adding words.

After all this your need to call spellChecker.setSpellIndex(directory).

Upvotes: 0

SharpBarb
SharpBarb

Reputation: 1590

You never built the index.

You need to setup the index...

Directory directory = FSDirectory.open(indexDir);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_20);
IndexWriter writer = new IndexWriter(directory,analyzer,true,IndexWriter.MaxFieldLength.UNLIMITED );

You then need to create a document and add each term to the document as an analyzed field..

 Document doc = new Document();
 doc.Add(new Field("name", word , Field.Store.YES, Field.Index.ANALYZED));

Then add the document to the index

writer.AddDocument(doc);

writer.Optimize();

Now build the index and close the index writer.

writer.Commit();
writer.Close();

Upvotes: 0

Related Questions