Reputation: 184
The base is a list of approximately 2000 strings. Most of them are single words; some are two or three words long.
My query is a string of 4 to 9 words. I have to find out which of these 2000 words or word clusters appear in the query string.
Right now I am using a for loop. It works, but it takes a lot of time. What is the most efficient way of doing this?
Upvotes: 0
Views: 84
Reputation: 101
You can try something like this:
List<string> binOf2000Words = new List<string>
{
"One",
"Two",
"Three Four"
};
string query = "One Four Three";
var queryLookup = query.Split(' ').ToLookup(v => v, v => v);
var result = binOf2000Words.SelectMany(s => s.Split(' ')).Distinct().Where(w => queryLookup.Contains(w));
Upvotes: 0
Reputation: 6390
This should be what you are looking for:
var binOf2000Words = new List<string>();
var binOf4To9Words = new List<string>();
// And at this point you have some code to populate your lists.
// We now need to cater for the fact that some of the items in the 2000Words bin will actually be strings with more than one word...
// We'll do away with that by generating a new list that only contains single words.
binOf2000Words = binOf2000Words.SelectMany(s => s.Split(' ')).Distinct().ToList();
var result = binOf2000Words.Intersect(binOf4To9Words).Distinct().ToList();
Upvotes: 0
Reputation: 288
You can try a HashSet.
Place your 2000 words into the HashSet, and then use HashSet.Contains to test each word of the query:
HashSet<string> h = new HashSet<string>(); // load your dictionary here
if (h.Contains(word))
    Console.WriteLine("Found");
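Since some of the 2000 entries are two or three words long, a per-word lookup alone would miss them. Here is a minimal sketch of one way to cover that, assuming entries are at most three words long, words are separated by single spaces, and matching is case-sensitive. The names PhraseMatcher, FindMatches and maxWords are made up for illustration and are not from the question.

```csharp
using System;
using System.Collections.Generic;

public static class PhraseMatcher
{
    // Returns every dictionary entry that occurs in the query as a run of
    // consecutive whole words. maxWords is the assumed longest entry length.
    public static List<string> FindMatches(IEnumerable<string> dictionary, string query, int maxWords = 3)
    {
        // Build the set once; each lookup afterwards is O(1) on average.
        var set = new HashSet<string>(dictionary);
        string[] words = query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var found = new List<string>();

        for (int start = 0; start < words.Length; start++)
        {
            // Try the 1-, 2- and 3-word candidates beginning at this position.
            for (int len = 1; len <= maxWords && start + len <= words.Length; len++)
            {
                string candidate = string.Join(" ", words, start, len);
                if (set.Contains(candidate) && !found.Contains(candidate))
                    found.Add(candidate);
            }
        }
        return found;
    }
}
```

With the dictionary { "One", "Two", "Three Four" } and the query "One Three Four", this returns "One" and "Three Four". A 9-word query produces at most 24 candidates, so the cost per query stays small no matter how large the dictionary grows.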
Upvotes: 1
Reputation: 460058
You have to use a loop; there is no other way to process multiple items.
Maybe this is more efficient (it is difficult to compare without seeing your code):
string[] words = your4to9Words.Split();
List<string> appearing = stringList
    .Where(s => s.Split().Intersect(words).Any())
    .ToList();
Upvotes: 1