Reputation: 3479
I'm a new Lucene user and am trying to learn the basics.
I have three files: apache_empty.txt (an empty file), apache.txt (contains many 'apache' tokens), and other.txt (contains just one token, 'apache').
When I search for 'apache', I get only apache.txt and other.txt in the results, but I also want to get apache_empty.txt, which has the searched word in its name.
This is how I add documents to the index:
protected Document getDocument(File f) throws Exception
{
    Document doc = new Document();
    // File body: tokenized and indexed from the reader, not stored.
    Field contents = new Field("contents", new FileReader(f));
    // Parent directory: stored and indexed as a single, unanalyzed term.
    Field parent = new Field("parent", f.getParent(), Field.Store.YES, Field.Index.NOT_ANALYZED);
    // File name: stored and passed through the analyzer.
    Field filename = new Field("filename", f.getName(), Field.Store.YES, Field.Index.ANALYZED);
    // Full path: stored and indexed as a single, unanalyzed term.
    Field fullpath = new Field("fullpath", f.getCanonicalPath(), Field.Store.YES, Field.Index.NOT_ANALYZED);
    filename.setBoost(2.0F);
    doc.add(contents);
    doc.add(parent);
    doc.add(filename);
    doc.add(fullpath);
    return doc;
}
How can I make Lucene match on file names as well?
Upvotes: 4
Views: 2351
Reputation: 68962
To use a wildcard, search for apache*, which would also match your file name apache_empty.
For the complete syntax, see the Apache Lucene Query Parser documentation.
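To illustrate why the wildcard helps, here is a small plain-Java model of the matching. It is not Lucene code: the class WildcardSketch and its methods are hypothetical, and it assumes the analyzer kept the file name as the single lowercased token apache_empty.txt, which depends on the analyzer you use. Also note that if your query parser's default field is contents, the query would presumably need the field prefix from the classic query syntax, as in filename:apache*.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Toy model of term vs. prefix matching; this is NOT the Lucene API.
class WildcardSketch {
    // A plain term query matches only a token identical to the query term.
    static boolean termMatches(List<String> tokens, String term) {
        return tokens.contains(term.toLowerCase(Locale.ROOT));
    }

    // A trailing-wildcard query such as apache* matches any token with that prefix.
    static boolean prefixMatches(List<String> tokens, String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        return tokens.stream().anyMatch(t -> t.startsWith(p));
    }

    public static void main(String[] args) {
        // Assumption: the analyzer emitted the whole file name as one token.
        List<String> filenameTokens = Arrays.asList("apache_empty.txt");
        System.out.println(termMatches(filenameTokens, "apache"));   // prints false
        System.out.println(prefixMatches(filenameTokens, "apache")); // prints true
    }
}
```

The exact term apache fails because the only indexed token is longer than the query term, while the prefix form succeeds.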
An alternative would be to have the analyzer you use treat the underscore as a word separator.
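As a rough sketch of what treating the underscore as a separator means, the plain-Java snippet below splits a file name on every character that is not a letter or digit. It is not a Lucene Analyzer; the class FileNameTokens and its tokenize method are hypothetical names used only for illustration. In Lucene itself, SimpleAnalyzer (which breaks text at non-letters and lowercases it) is one candidate worth checking for the filename field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java sketch of the tokenization idea; NOT a Lucene Analyzer.
class FileNameTokens {
    // Lowercase the name, then split on runs of characters that are neither
    // letters nor digits, so '_' and '.' both act as word separators.
    static List<String> tokenize(String fileName) {
        List<String> tokens = new ArrayList<>();
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String part : lower.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) { // skip the empty piece left by a leading separator
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("apache_empty.txt")); // prints [apache, empty, txt]
    }
}
```

With tokens like these in the filename field, a plain search for apache on that field would match apache_empty.txt without a wildcard.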
Upvotes: 6