theqs1000

Reputation: 377

Searching a directory for a keyword efficiently in C#

I'm trying to find the most efficient way to search a directory full of text files (possibly 2000 files of around 150 lines each) for a keyword. If I were searching for just one keyword, performance wouldn't be much of an issue, but in my application I want to be able to search for a different keyword later, possibly multiple times. Iterating over the entire file collection on every search seems time-consuming, and holding everything in memory seems expensive too.

What would be the best way to do this? I don't have access to an SQL database or anything like that, so I can't temporarily dump the contents into a database and search that periodically; it's just going to be a regular Windows application.

The most primitive approach I can think of is to dump all of the files into one huge XML file and search that, rather than iterating through every file in the directory each time a keyword search happens. But even this seems like it could be quite slow.

I will know the directory name in advance, so I can pre-process the contents if that would help with optimisation.

Any suggestions are welcome, thanks.

Upvotes: 4

Views: 625

Answers (2)

DerApe

Reputation: 3175

As "L.B" stated, you can use Lucene.net to create an inverted index. It is a .NET implementation of a Java library. Lucene on apache.org

This is a small example of how to do it.
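To illustrate what an inverted index does, here is a minimal hand-rolled sketch. It is not Lucene.net and does not use its API; the class name `SimpleInvertedIndex`, the `*.txt` filter, and the whole-word, case-insensitive matching are all assumptions made for the example. The idea is to read every file once, record which files each word appears in, and then answer later keyword searches with a dictionary lookup instead of rescanning the directory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Lucene.net): maps each word to the set of files containing it.
public sealed class SimpleInvertedIndex
{
    private readonly Dictionary<string, HashSet<string>> _postings =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    // Reads every *.txt file in the directory once and records the words found in it.
    public void Build(string directory)
    {
        foreach (string path in Directory.EnumerateFiles(directory, "*.txt"))
        {
            foreach (string line in File.ReadLines(path))
            {
                foreach (Match word in Regex.Matches(line, @"\w+"))
                {
                    HashSet<string> files;
                    if (!_postings.TryGetValue(word.Value, out files))
                    {
                        files = new HashSet<string>();
                        _postings[word.Value] = files;
                    }
                    files.Add(path);
                }
            }
        }
    }

    // Returns the files containing the keyword as a whole word; empty if there are none.
    public IEnumerable<string> Search(string keyword)
    {
        HashSet<string> files;
        if (_postings.TryGetValue(keyword, out files))
        {
            return files;
        }
        return new List<string>();
    }
}

public static class IndexDemo
{
    // Usage: IndexDemo.exe <directory> <keyword>
    public static void Main(string[] args)
    {
        var index = new SimpleInvertedIndex();
        index.Build(args[0]);
        foreach (string file in index.Search(args[1]))
        {
            Console.WriteLine(file);
        }
    }
}
```

The index stores each distinct word once plus a set of file paths, so it is typically much smaller than the raw text. It only matches whole words; substring or phrase searches would need something more capable, such as Lucene.net itself.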

Upvotes: 0

Roy Dictus

Reputation: 33139

Why not use a command-line utility that you call from C#?

The findstr utility that ships with Windows can do what you need, and it is efficient: http://technet.microsoft.com/en-us/library/bb490907.aspx

How to call it from C#: How To: Execute command line in C#, get STD OUT results
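As a rough sketch of that approach, the code below starts findstr with `System.Diagnostics.Process` and collects the names of matching files from standard output. The class and method names, the `*.txt` filter, and the choice of the `/M`, `/I` and `/C:` switches are assumptions for illustration; it also assumes the keyword contains no double quotes and that the program runs on Windows.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical wrapper around the Windows findstr utility.
public static class FindstrSearch
{
    // Returns the names of the *.txt files in the directory that contain the keyword.
    public static List<string> FilesContaining(string directory, string keyword)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "findstr",
            // /M prints only file names, /I ignores case, /C:"..." searches for a literal string.
            // Assumption: keyword contains no double quotes.
            Arguments = "/M /I /C:\"" + keyword + "\" *.txt",
            WorkingDirectory = directory,
            UseShellExecute = false,        // required in order to redirect output
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        var matches = new List<string>();
        using (Process process = Process.Start(startInfo))
        {
            string line;
            while ((line = process.StandardOutput.ReadLine()) != null)
            {
                if (line.Length > 0)
                {
                    matches.Add(line);
                }
            }
            process.WaitForExit();
        }
        return matches;
    }
}

public static class FindstrDemo
{
    // Usage: FindstrDemo.exe <directory> <keyword>
    public static void Main(string[] args)
    {
        foreach (string file in FindstrSearch.FilesContaining(args[0], args[1]))
        {
            Console.WriteLine(file);
        }
    }
}
```

Note that this still scans every file on each search, so it avoids holding anything in memory but does not avoid the repeated I/O that the question is concerned about.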

Good luck!

Upvotes: 3
