jcvandan

Reputation: 14314

Writing full-text search algorithm in C# / Entity Framework - where to start?

I need to search a potentially large collection of sentences, and I have no idea where to start.

In summary, a user will submit a search phrase, for example "how do I delete my account". I then need to go to the database and match the words provided against the stored sentences.

At the moment I am thinking of doing something like the following:

Could anyone point me in the right direction? Also, if anyone knows of any libraries for this sort of work, that would be great.

Cheers

Upvotes: 9

Views: 4225

Answers (3)

Marcin Deptuła

Reputation: 11957

As for prioritizing words, a simple but fairly effective solution is to rank them by how common they are (a popularity index could be built from the articles in your database), so that words that are rare in your texts count as more important. This way you boost the words that are less general.
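To make the rare-word idea concrete, here is a minimal sketch in plain C#. It assumes the sentences fit in memory, and `WordWeights` and its methods are hypothetical names made up for illustration, not part of any library. It counts how many sentences contain each word and then orders the query's words from rarest to most common.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not from any library): ranks query words by rarity.
public static class WordWeights
{
    // Lowercase the text and split it on anything that is not a letter or digit.
    public static string[] Tokenize(string text)
    {
        var cleaned = text.ToLowerInvariant()
            .Select(c => char.IsLetterOrDigit(c) ? c : ' ')
            .ToArray();
        return new string(cleaned)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // For each word, count how many sentences contain it at least once.
    public static Dictionary<string, int> DocumentFrequencies(IEnumerable<string> sentences)
    {
        var counts = new Dictionary<string, int>();
        foreach (var sentence in sentences)
        {
            foreach (var word in Tokenize(sentence).Distinct())
            {
                counts.TryGetValue(word, out var current); // current is 0 if the word is new
                counts[word] = current + 1;
            }
        }
        return counts;
    }

    // Order the query's distinct words from rarest to most common.
    // Words that never occur in the corpus count as frequency 0 and sort first.
    public static List<string> RankByRarity(string query, Dictionary<string, int> frequencies)
    {
        return Tokenize(query)
            .Distinct()
            .OrderBy(w => frequencies.TryGetValue(w, out var n) ? n : 0)
            .ThenBy(w => w, StringComparer.Ordinal) // stable tie-break
            .ToList();
    }
}
```

With a small corpus of "how do I ..." sentences, calling `RankByRarity("how do I delete my account", frequencies)` puts "delete" and "account" ahead of "how", "do", "I" and "my", which is the boost for less general words described above. You could then match on the first few words only, or weight each match by its rank.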

Another problem is that words can appear in different forms, such as past or future tense, so you might be interested in stemming them. As far as I remember, one tool that has been ported to C# is the Snowball project.

As for the second part of your problem, looping through the words might be very inefficient; I think you should consider using an indexing library or solution. One that is popular for .NET is Lucene.Net. It basically creates an inverted index, which maps terms (such as words) to the articles that contain them, which lets you quickly find all occurrences of given words in your texts. You could implement a similar approach yourself inside your database.
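If it helps to see the shape of such an index, here is a rough in-memory sketch in plain C#. This is not how Lucene.Net is implemented and it does not use its API; `InvertedIndex`, `Add` and `Search` are hypothetical names, and the tokenizer is deliberately naive. It maps each word to the ids of the sentences containing it, then ranks sentences by how many distinct query words they contain.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory inverted index, for illustration only.
public sealed class InvertedIndex
{
    // word -> ids of the sentences that contain it (case-insensitive keys)
    private readonly Dictionary<string, HashSet<int>> _postings =
        new Dictionary<string, HashSet<int>>(StringComparer.OrdinalIgnoreCase);

    private static readonly char[] Separators = { ' ', ',', '.', '?', '!', ';', ':', '"' };

    // Naive tokenizer: split on spaces and a few punctuation marks.
    private static IEnumerable<string> Tokenize(string text)
    {
        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }

    // Register a sentence under every word it contains.
    public void Add(int id, string sentence)
    {
        foreach (var word in Tokenize(sentence))
        {
            if (!_postings.TryGetValue(word, out var ids))
            {
                ids = new HashSet<int>();
                _postings[word] = ids;
            }
            ids.Add(id);
        }
    }

    // Return sentence ids, best match first, ranked by the number of
    // distinct query words each sentence contains (ties broken by id).
    public List<int> Search(string query)
    {
        var hits = new Dictionary<int, int>();
        foreach (var word in Tokenize(query).Distinct(StringComparer.OrdinalIgnoreCase))
        {
            if (!_postings.TryGetValue(word, out var ids)) continue;
            foreach (var id in ids)
            {
                hits.TryGetValue(id, out var count); // count is 0 for a new id
                hits[id] = count + 1;
            }
        }
        return hits.OrderByDescending(h => h.Value)
                   .ThenBy(h => h.Key)
                   .Select(h => h.Key)
                   .ToList();
    }
}
```

Each lookup touches only the postings for the query's words rather than every stored sentence, which is why this scales better than looping. The same structure can live in the database as a table of (word, sentence id) rows with an index on the word column.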

Upvotes: 6

jcvandan

Reputation: 14314

Just in case anyone comes across this and wonders what I used in the end: I ended up using Lucene.NET. It's fantastic, and very easy to set up and use considering it is so powerful and adds such great functionality. One thing I would say, though, is that the documentation isn't great. However, I did find a series of tutorials here that is a good introduction. I spent a morning going through these articles and had ridiculously fast full-text indexing and searching in my app!

Upvotes: 3

Ladislav Mrnka

Reputation: 364269

Use SQL Server's full-text search capability and wrap the full-text query in a stored procedure. Execute the stored procedure through either ADO.NET or EF.

Upvotes: 2
