Reputation: 103
I am not sure how to start solving this problem, so any suggestions would be helpful.
My client has a large number of static HTML pages, running into hundreds of files. These undergo updates every now and then and are overwritten on the website. We list the pages on the website via a simple left-hand explorer that mimics the folder structure in which the files are given to us.
We now want to give users the ability to search these files and display matching results. A brute-force search through such a large number of files would be very slow. Matching related words (for example plurals and misspellings) is also desirable, and showing results in order of popularity would be a useful feature. I am not sure how to get started on this. Should we pre-process the HTML files after every update, for instance? Are there any recommended indexing libraries available for .NET? What little programming has been done on the website has been done in C#.
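To make the requirement concrete, this is roughly the kind of pre-processing pass I could imagine running after each update (just a throwaway sketch; the folder path, output file and tag stripping are placeholders):

    // Illustrative only: build a word -> file index after each content update.
    // The folder path, output file, and regexes are placeholders.
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text.RegularExpressions;

    class IndexBuilder
    {
        static void Main()
        {
            var index = new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

            foreach (var file in Directory.EnumerateFiles(@"C:\site\pages", "*.html", SearchOption.AllDirectories))
            {
                // Crude tag stripping; a real HTML parser would be safer.
                var text = Regex.Replace(File.ReadAllText(file), "<[^>]+>", " ");

                foreach (Match word in Regex.Matches(text, @"[A-Za-z]{2,}"))
                {
                    HashSet<string> files;
                    if (!index.TryGetValue(word.Value, out files))
                        index[word.Value] = files = new HashSet<string>();
                    files.Add(file);
                }
            }

            // Persist the index so searches never have to open the HTML files.
            using (var writer = new StreamWriter(@"C:\site\search-index.txt"))
            {
                foreach (var entry in index)
                    writer.WriteLine(entry.Key + "\t" + string.Join("|", entry.Value));
            }
        }
    }

A search would then be a dictionary lookup rather than a scan of every file, but something this simple still would not handle plurals or misspellings, which is why I am asking whether a proper indexing library exists for .NET.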
Thanks MS
Upvotes: 1
Views: 604
Reputation: 897
I'd first write a simple program to transfer the contents of all those files into a database. Then you can implement the search properly without having to read every file on each query.
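Something along these lines would do it, as a rough sketch: it assumes SQLite with its FTS5 full-text extension, accessed via the Microsoft.Data.Sqlite package (the table name, paths and tag stripping are just examples, and you need a SQLite build that includes FTS5):

    using System.IO;
    using System.Text.RegularExpressions;
    using Microsoft.Data.Sqlite;

    class LoadPages
    {
        static void Main()
        {
            using (var conn = new SqliteConnection("Data Source=pages.db"))
            {
                conn.Open();

                // FTS5 virtual table gives tokenised full-text search over the stored text.
                var create = conn.CreateCommand();
                create.CommandText = "CREATE VIRTUAL TABLE IF NOT EXISTS pages USING fts5(path, body)";
                create.ExecuteNonQuery();

                foreach (var file in Directory.EnumerateFiles(@"C:\site\pages", "*.html", SearchOption.AllDirectories))
                {
                    // Strip markup so only the visible text is stored and indexed.
                    var body = Regex.Replace(File.ReadAllText(file), "<[^>]+>", " ");

                    var insert = conn.CreateCommand();
                    insert.CommandText = "INSERT INTO pages (path, body) VALUES ($path, $body)";
                    insert.Parameters.AddWithValue("$path", file);
                    insert.Parameters.AddWithValue("$body", body);
                    insert.ExecuteNonQuery();
                }
            }
        }
    }

A search is then a single query, for example SELECT path FROM pages WHERE pages MATCH 'pricing', instead of re-reading every file; rerun the loader (clearing the table first) whenever the pages are updated.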
Upvotes: 1
Reputation: 66
I am not sure if it's within your budget, but Google can do this for you, as user1161318 pointed out.
Try Google Site Search - http://www.google.co.uk/enterprise/search/products_gss.html
Upvotes: 0