Reputation: 1
I'm trying to run a query with Lucene that selects the documents whose title begins with the "@" character. I looked at the documentation, but my query returns zero results. Here are the code and the output. Thanks for your help.
This is the code:
IndexWriter w = new IndexWriter(index, config);
addDoc(w, "@aa Lucene in Action", "193398817");
addDoc(w, "@ba Lucene for Dummies", "55320055Z");
addDoc(w, "prova Managing Gigabytes", "55063554A");
addDoc(w, "The Art of Computer Science", "9900333X");
w.close();
String querystring = "@";
Query q;
q = new QueryParser(LUCENE_41, "title", new StandardAnalyzer(LUCENE_41)).parse(querystring);
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
TopDocs docs = searcher.search(q, 1000000);
ScoreDoc[] hits = docs.scoreDocs;
System.out.println("Found " + hits.length + " hits.");
for (int i = 0; i < hits.length; ++i) {
    int docId = hits[i].doc;
    Document d = searcher.doc(docId);
    System.out.println((i + 1) + ". " + d.get("isbn") + "\t" + d.get("title"));
}
reader.close();
And this is the output:
Building provaLucerne 1.0-SNAPSHOT
------------------------------------------------------------------------
--- exec-maven-plugin:1.2.1:exec (default-cli) @ provaLucerne ---
Found 0 hits.
------------------------------------------------------------------------
BUILD SUCCESS
------------------------------------------------------------------------
Total time: 1.505s
Finished at: Wed Nov 02 19:49:39 CET 2016
Final Memory: 5M/155M
Upvotes: 0
Views: 318
Reputation: 3957
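You are using StandardAnalyzer, which uses StandardTokenizer. The "@" character is among the punctuation that StandardTokenizer splits on and discards.
So the string "@aa Lucene in Action" is tokenized into the tokens "aa", "Lucene", "in", "Action", and the "@" never reaches the index. Your query string "@" goes through the same analyzer, so it ends up with no terms at all, which is why you get 0 hits.
You can use KeywordAnalyzer or WhitespaceAnalyzer, for both indexing and querying, and see if that solves your problem. Keep in mind that WhitespaceAnalyzer keeps "@aa" as a single token, so to match titles that begin with "@" you will most likely need a prefix query such as @* rather than the bare @.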
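To make the difference between the analyzers concrete, here is a minimal plain-Java sketch. It does not call Lucene at all: it only imitates, very roughly, how each analyzer cuts a title into tokens. The class and method names (AnalyzerSketch, standardLike, whitespaceLike, keywordLike, anyTokenStartsWith) are made up for illustration. The real StandardTokenizer follows Unicode word-break rules, and StandardAnalyzer also removes stop words, neither of which is modeled here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class: a plain-Java imitation, not Lucene code.
class AnalyzerSketch {

    // Rough model of StandardAnalyzer: break on every character that is not
    // a letter or digit, then lowercase. Stop-word removal is not modeled.
    static List<String> standardLike(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Rough model of WhitespaceAnalyzer: break on whitespace only, so
    // punctuation such as "@" stays attached to its word.
    static List<String> whitespaceLike(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Rough model of KeywordAnalyzer: the whole value is one token.
    static List<String> keywordLike(String text) {
        return Collections.singletonList(text);
    }

    // Models a prefix query: true if any token starts with the prefix.
    static boolean anyTokenStartsWith(List<String> tokens, String prefix) {
        for (String token : tokens) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] titles = {
            "@aa Lucene in Action",
            "@ba Lucene for Dummies",
            "prova Managing Gigabytes",
            "The Art of Computer Science"
        };
        // The query text goes through the same analysis as the titles.
        System.out.println("query \"@\" under standardLike: " + standardLike("@"));
        for (String title : titles) {
            System.out.println(title);
            System.out.println("  standardLike:   " + standardLike(title));
            System.out.println("  whitespaceLike: " + whitespaceLike(title));
            System.out.println("  keywordLike:    " + keywordLike(title));
            System.out.println("  whitespace prefix '@': "
                    + anyTokenStartsWith(whitespaceLike(title), "@"));
        }
    }
}
```

In this model the query "@" produces no token under standardLike, which mirrors the 0 hits above, while a prefix test on "@" against the whitespaceLike or keywordLike tokens picks out only the two titles that start with "@". Whether the same holds in your index depends on how addDoc defines the title field, so treat this as an illustration of the idea rather than a verified fix.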
Upvotes: 0