gaffcz

Reputation: 3479

Lucene: how to index file names

I'm new to Lucene and trying to learn the basics.

I have three files: apache.txt, other.txt, and apache_empty.txt.

When I search for 'apache', I get only apache.txt and other.txt in the results, but I also want to get apache_empty.txt, which has the search term in its name.

This is how I add documents to the index:

protected Document getDocument(File f) throws Exception 
{
  Document doc   = new Document();
  // File contents: read through a Reader, so tokenized and indexed but not stored
  Field contents = new Field("contents", new FileReader(f));
  // Parent directory and full path: stored and indexed as single, unanalyzed terms
  Field parent   = new Field("parent",   f.getParent(), Field.Store.YES, Field.Index.NOT_ANALYZED);
  // File name: stored and run through the analyzer
  Field filename = new Field("filename", f.getName(), Field.Store.YES, Field.Index.ANALYZED);
  Field fullpath = new Field("fullpath", f.getCanonicalPath(), Field.Store.YES, Field.Index.NOT_ANALYZED);
  // Give matches on the file name twice the default weight
  filename.setBoost(2.0F);
  doc.add(contents);
  doc.add(parent);
  doc.add(filename);
  doc.add(fullpath);
  return doc;
}

How can I make Lucene index the file names as well?

Upvotes: 4

Views: 2351

Answers (1)

stacker

Reputation: 68962

To use a wildcard, search for apache*, which would also match your file name apache_empty. For the complete syntax, see the Apache Lucene Query Parser documentation.
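To illustrate why the wildcard helps, here is a minimal plain-Java sketch. It is not Lucene code: the class WildcardSketch and its methods are hypothetical, and it assumes each file name ended up in the index as a single term. Under that assumption an exact term query for apache matches nothing, while the prefix form apache* does. In Lucene's query syntax a field is addressed as field:term, so if the default search field is contents, the query would probably need to be written as filename:apache*.

```java
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java illustration only; this does not use the Lucene API.
final class WildcardSketch {
    // Hypothetical helper: a trailing '*' means prefix match, otherwise exact term match.
    static boolean matches(String query, String term) {
        if (query.endsWith("*")) {
            String prefix = query.substring(0, query.length() - 1);
            return term.startsWith(prefix);
        }
        return term.equals(query);
    }

    // Returns the terms that the query matches, in their original order.
    static List<String> search(String query, List<String> terms) {
        return terms.stream()
                    .filter(t -> matches(query, t))
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumption: each file name was indexed as one unsplit term.
        List<String> filenameTerms = List.of("apache.txt", "other.txt", "apache_empty.txt");
        System.out.println(search("apache", filenameTerms));  // prints []
        System.out.println(search("apache*", filenameTerms)); // prints [apache.txt, apache_empty.txt]
    }
}
```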

An alternative is to have the analyzer you use treat the underscore as a word separator.
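The effect of such a separator rule can be sketched in plain Java. This is only an illustration of the tokenization, not a Lucene analyzer: the class FileNameTokens and its tokenize method are hypothetical names, and the rule used here (split on anything that is not a letter or digit, then lowercase) is an assumption. In Lucene the equivalent rule would belong in the analyzer applied to the filename field, and the same analyzer should be used when parsing queries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; this does not use the Lucene API.
final class FileNameTokens {
    // Splits a file name on every run of characters that are neither letters
    // nor digits (so '_' and '.' act as separators) and lowercases each part.
    static List<String> tokenize(String name) {
        List<String> tokens = new ArrayList<>();
        for (String part : name.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("apache_empty.txt")); // prints [apache, empty, txt]
    }
}
```

With tokens like these, a plain search for apache in the filename field would find apache_empty.txt without a wildcard.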

Upvotes: 6
