Yahoo

Reputation: 4187

Indexing files and searching them through a web application

I have a shared folder on the LAN that contains lots of PDF and DOC files. I want to create a web page in PHP or ASP that indexes those files and lets me search their content.

The search should return the relevant documents. How can I do that? What is the procedure? Can this be done at all?

Upvotes: 1

Views: 3809

Answers (3)

kta

Reputation: 20130

You can build a tree dynamically (any recursive tree algorithm will do) that reflects the actual folder structure, and show that tree on a web page.

To show the tree on the web page, you can use a jQuery or YUI tree widget fed from PHP.

At the bottom of the tree are the files. When the user clicks one, you can show its content in the browser.

If your folder structure has too many levels, you can cache the tree so that you don't need to rebuild it on every request.
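As a rough illustration of the recursive tree and the cache, here is a minimal Python sketch (the same idea carries over to PHP or ASP). The names `build_tree` and `cached_tree`, the extension list, and the five-minute cache lifetime are assumptions made for the example, not part of any library.

```python
import os
import time

# Hypothetical defaults: which file types to list and how long a cached tree is reused.
DOC_EXTENSIONS = (".pdf", ".doc", ".docx")
_tree_cache = {}  # root path -> (time built, tree)


def build_tree(root, extensions=DOC_EXTENSIONS):
    """Recursively mirror the folder under `root` as nested dicts."""
    node = {
        "name": os.path.basename(os.path.normpath(root)),
        "path": root,
        "dirs": [],
        "files": [],
    }
    with os.scandir(root) as entries:
        for entry in sorted(entries, key=lambda e: e.name.lower()):
            if entry.is_dir(follow_symlinks=False):
                # Each subfolder becomes a child node built the same way.
                node["dirs"].append(build_tree(entry.path, extensions))
            elif entry.name.lower().endswith(extensions):
                # Files are the leaves of the tree.
                node["files"].append(entry.name)
    return node


def cached_tree(root, max_age=300.0):
    """Return a previously built tree if it is younger than `max_age` seconds."""
    now = time.monotonic()
    hit = _tree_cache.get(root)
    if hit is not None and now - hit[0] < max_age:
        return hit[1]
    tree = build_tree(root)
    _tree_cache[root] = (now, tree)
    return tree
```

The nested dict can be serialized to JSON and handed to whichever tree widget the page uses. Note that this only lets users browse the files; it does not search their content.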

Hope this helps, mate. All the best.

Upvotes: 0

Richard Benson

Reputation: 1487

All Windows Server versions include an easy-to-use Indexing Service that you can access programmatically without installing any third-party software. It can index almost any kind of document, either natively or through third-party iFilters.

If you are using Server 2003 or below, it is probably already installed on your server: open Computer Management and it will be listed under "Services and Applications". If you are on Server 2008, add it to the File Services role under "Windows Server 2003 File Services".

Once installed, follow this guide to create a new index.

It should index Office documents out of the box; if not, you can download the full Office iFilter pack for free. To index PDF files, you only need to install Adobe Reader on the server, and the iFilter will be installed alongside it.

You can now develop your own search pages in ASP to query the index. It is very powerful and fast, and it obeys NTFS permissions, so you can safely index all your files knowing that (as long as you use Windows Authentication in IIS) the searching user will only see files they have permission to access.

We use Indexing Service in all our offices with great success. I will see if I am allowed to open-source our solution, but all the code is out there anyway.

You could then use our solution as a base, and it should give you full access to your files.

Upvotes: 1

mamoo

Reputation: 8166

You can choose among several solutions; all of them basically require you to implement a system with two parts:

1) A search engine

2) A (web) client

Probably the most suitable solution is to use Solr as the engine and PHP as the client. You can find a kick-start tutorial here:

http://www.ibm.com/developerworks/opensource/library/os-php-apachesolr/
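To give a feel for the client side, here is a minimal sketch of a search call against Solr's HTTP `select` handler. It is written in Python for brevity; a PHP client would send the same request. The base URL, the core name `documents`, and the helper names are assumptions for illustration, and the sketch does not cover getting the PDF and DOC content into the index in the first place.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(base_url, core, text, rows=10):
    """Build the URL for a query against the core's select handler."""
    params = urlencode({"q": text, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


def search(base_url, core, text, rows=10):
    """Run the query and return the list of matching documents."""
    with urlopen(build_select_url(base_url, core, text, rows), timeout=10) as resp:
        payload = json.load(resp)
    # In Solr's JSON response format the hits are under response -> docs.
    return payload["response"]["docs"]


if __name__ == "__main__":
    # Assumed local Solr instance with a hypothetical core named "documents".
    for doc in search("http://localhost:8983/solr", "documents", "annual report"):
        print(doc.get("id"))
```

The web page then only has to take the user's search text, call something like `search`, and render the returned documents as links to the files on the share.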

Upvotes: 1
