Harish Mohanan

Reputation: 184

Search for occurrences of a list of strings in a string in C#

The base is a list of approximately 2000 strings. Most of them are single words; some are two or three words long.

My query is a string of 4 to 9 words. I have to find out which of these 2000 words or word clusters appear in the query string.

Right now I am using a for loop. It works, but it takes a lot of time. What is the most efficient way of doing this?

Upvotes: 0

Views: 84

Answers (4)

Zaranitos

Reputation: 101

You can try something like this:

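// Requires: using System.Collections.Generic; using System.Linq;
List<string> binOf2000Words = new List<string>
{
    "One",
    "Two",
    "Three Four"
};
string query = "One Four Three";
// Index the query's words for fast membership checks.
var queryLookup = query.Split(' ').ToLookup(v => v, v => v);
// Note: multi-word entries are split here, so they are matched word by word, not as a phrase.
var result = binOf2000Words.SelectMany(s => s.Split(' ')).Distinct().Where(w => queryLookup.Contains(w));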

Upvotes: 0

Martin Milan

Reputation: 6390

This should be what you are looking for:

            var binOf2000Words = new List<string>();
            var binOf4To9Words = new List<string>();

            // And at this point you have some code to populate your lists.

            // We now need to cater for the fact that some of the items in the 2000Words bin will actually be strings with more than one word...
            // We'll do away with that by generating a new list that only contains single words.

            binOf2000Words = binOf2000Words.SelectMany(s => s.Split(' ')).Distinct().ToList();

            // Intersect already returns distinct elements, so no extra Distinct() is needed.
            var result = binOf2000Words.Intersect(binOf4To9Words).ToList();

Upvotes: 0

2174714

Reputation: 288

You can try a HashSet<string>.

Put your 2000 words into the HashSet, and then check each word of the query with HashSet.Contains:

HashSet<string> h = new HashSet<string>();  // load your dictionary here
// word is a single word (or phrase) taken from the query string
if (h.Contains(word))
    Console.WriteLine("Found");
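Since some of the 2000 entries are two or three words long, a single-word lookup will miss them. One possible way to cover them with the same HashSet idea is to test every run of one to three consecutive query words against the set. This is only a sketch under assumptions: `PhraseMatcher`, `FindPhrases` and `maxWords` are names made up for illustration, words are assumed to be separated by single spaces, and matching is case-sensitive.

```csharp
using System;
using System.Collections.Generic;

public static class PhraseMatcher
{
    // Hypothetical helper: returns every entry of `phrases` that occurs in `query`
    // as a run of consecutive words. maxWords = 3 is an assumption based on the
    // question (entries are one to three words long).
    public static List<string> FindPhrases(HashSet<string> phrases, string query, int maxWords = 3)
    {
        // Split the query into words, ignoring repeated spaces.
        string[] words = query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var found = new List<string>();

        for (int start = 0; start < words.Length; start++)
        {
            // Build candidates of 1..maxWords consecutive words starting at `start`.
            for (int len = 1; len <= maxWords && start + len <= words.Length; len++)
            {
                string candidate = string.Join(" ", words, start, len);

                // Hash lookup against the 2000 entries; skip duplicates in the result.
                if (phrases.Contains(candidate) && !found.Contains(candidate))
                    found.Add(candidate);
            }
        }
        return found;
    }
}
```

For example, with the set `{ "One", "Two", "Three Four" }` and the query `"One Three Four Five"`, `PhraseMatcher.FindPhrases(set, query)` returns `"One"` and `"Three Four"`. A 9-word query produces at most 24 candidates, so the number of lookups depends on the query length rather than on the 2000 entries.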

Upvotes: 1

Tim Schmelter

Reputation: 460058

You have to use a loop; there is no other way to process multiple items.

Maybe this is more efficient (difficult to compare without seeing your code):

string[] words = your4to9Words.Split();
List<string> appearing = stringList
    .Where(s => s.Split().Intersect(words).Any())
    .ToList();

Upvotes: 1
