samnaction

Reputation: 1254

Searching contents of files

I am planning to create a local search engine on my intranet that searches the contents of files such as xls, xlsx, doc, docx, pdb, etc.

After searching the internet, I think Lucene (with Luke) can be used for this. Am I right? Can Lucene be integrated into a website?

I have around 500 GB of files. Can Lucene handle that many files? Is there an alternative?

I know only the basics of C and C++ and have no prior knowledge of this area. I am a self-learner, so please suggest a good book on Lucene.

Upvotes: 0

Views: 62

Answers (1)

Persimmonium

Reputation: 15771

Yes, Lucene can be used for this. But Lucene is just a library, so there is some code you need to write yourself:

- crawling code
- text extraction
- a searcher app
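To give a rough idea of what those three pieces involve, here is a toy sketch in plain Python. It does not use Lucene at all: the in-memory dictionary stands in for a real index, and `extract_text` only reads `.txt` files, which is an assumption made to keep the example self-contained (real xls/doc files need a proper extractor). All function names are made up for illustration.

```python
import os
import re
from collections import defaultdict


def crawl(root):
    """Crawling step: yield the path of every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def extract_text(path):
    """Text extraction step.

    Assumption: only plain .txt files are handled. Anything else
    returns an empty string instead of being parsed.
    """
    if not path.lower().endswith(".txt"):
        return ""
    with open(path, encoding="utf-8", errors="ignore") as handle:
        return handle.read()


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(root):
    """Map each word to the set of file paths that contain it."""
    index = defaultdict(set)
    for path in crawl(root):
        for word in tokenize(extract_text(path)):
            index[word].add(path)
    return index


def search(index, query):
    """Searcher step: return sorted paths containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)
```

You would call `build_index` once on the shared folder and then `search(index, "some words")` for each query. An index kept in memory like this will not cope with 500 GB of documents, which is exactly the part Lucene does for you with an on-disk index.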

So you might be better off looking at Solr, which is built on top of Lucene and has many built-in features you would use: a solid server you can access from any language, DIH (DataImportHandler) for your crawling needs, and Tika integration for text extraction, among many other things.
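As a sketch of what "access from any language" looks like in practice, a client only has to send an HTTP request to Solr's select handler and read the JSON that comes back. The base URL and the core name `files` below are assumptions for illustration, as are the function names. Adjust them to your own installation.

```python
import json
import urllib.parse
import urllib.request


def build_select_url(base_url, core, query, rows=10):
    """Build a URL for the select handler of one Solr core."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url}/{core}/select?{params}"


def search_solr(base_url, core, query, rows=10):
    """Run the query and return the list of matching documents.

    Assumes a Solr server is reachable at base_url and that the
    named core exists and has already been indexed.
    """
    url = build_select_url(base_url, core, query, rows)
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    return payload["response"]["docs"]
```

For example, `search_solr("http://localhost:8983/solr", "files", "budget report")` would return the matching documents as a list of dictionaries, provided such a core is running on that host.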

Upvotes: 1
