Reputation: 561
This is my situation: I have a string containing some text:
string myText = "Text to analyze for words, bar, foo";
And a list of words to search for in it:
List<string> words = new List<string> {"foo", "bar", "xyz"};
I'd like to know the most efficient way, if one exists, to get the list of those words that appear in the text, something like this:
List<string> matches = myText.findWords(words);
Upvotes: 10
Views: 6887
Reputation: 8208
Here's a simple solution that accounts for whitespace and punctuation:
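// Requires: using System; using System.Collections.Generic;
//           using System.Linq; using System.Text.RegularExpressions;
static void Main(string[] args)
{
    string sentence = "Text to analyze for words, bar, foo";
    // Split on runs of non-word characters (whitespace, punctuation).
    var words = Regex.Split(sentence, @"\W+");
    var searchWords = new List<string> { "foo", "bar", "xyz" };
    // Keep only the tokens that also appear in the search list.
    var foundWords = words.Intersect(searchWords);
    foreach (var item in foundWords)
    {
        Console.WriteLine(item);
    }
    Console.ReadLine();
}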
Upvotes: 0
Reputation: 670
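A Regex solution:
var words = new string[]{"Lucy", "play", "soccer"};
var text = "Lucy loves going to the field and play soccer with her friend";
// Escape each word so regex metacharacters are matched literally, and
// anchor with \b so that "play" does not match inside "playing".
var pattern = @"\b(?:" + String.Join("|", words.Select(w => Regex.Escape(w))) + @")\b";
var match = new Regex(pattern).Match(text);
var result = new List<string>();
while (match.Success) {
    result.Add(match.Value);
    match = match.NextMatch();
}
// Result: ["Lucy", "play", "soccer"]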
Upvotes: 3
Reputation: 149538
You can use a HashSet<string> and intersect both collections:
string myText = "Text to analyze for words, bar, foo";
string[] splitWords = myText.Split(' ', ',');
HashSet<string> hashWords = new HashSet<string>(splitWords,
    StringComparer.OrdinalIgnoreCase);
HashSet<string> words = new HashSet<string>(new[] { "foo", "bar" },
    StringComparer.OrdinalIgnoreCase);
// IntersectWith modifies hashWords in place, leaving only "bar" and "foo".
hashWords.IntersectWith(words);
Upvotes: 5
Reputation: 9041
Playing off the idea that you want to be able to call myText.findWords(words), you can write an extension method on the String class that does just that.
public static class StringExtensions
{
    // Returns every entry of `words` that occurs in `str` as a substring.
    public static List<string> findWords(this string str, List<string> words)
    {
        return words.Where(str.Contains).ToList();
    }
}
Usage:
string myText = "Text to analyze for words, bar, foo";
List<string> words = new List<string> { "foo", "bar", "xyz" };
List<string> matches = myText.findWords(words);
Console.WriteLine(String.Join(", ", matches.ToArray()));
Console.ReadLine();
Results:
foo, bar
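One thing to keep in mind: str.Contains does substring matching, so "bar" would also be reported for a text that only contains "barn". If whole-word matching is what you need, a rough sketch along the following lines may help. The class and method names (WholeWordExtensions, FindWholeWords) are made up for illustration, and splitting on \W+ with a case-insensitive comparer is an assumption about what counts as a word.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WholeWordExtensions
{
    // Hypothetical helper: returns the entries of `words` that occur
    // in `str` as whole words (case-insensitive).
    public static List<string> FindWholeWords(this string str, IEnumerable<string> words)
    {
        // Tokenize once on runs of non-word characters and index the tokens.
        var tokens = new HashSet<string>(
            Regex.Split(str, @"\W+"),
            StringComparer.OrdinalIgnoreCase);

        // Each lookup is O(1) on average, so the total work is roughly
        // linear in the text length plus the number of search words.
        return words.Where(w => tokens.Contains(w)).ToList();
    }
}
```

Calling "Text to analyze for words, barn, foo".FindWholeWords(words) with the list above should then return only foo, whereas the Contains version would also return bar.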
Upvotes: 0
Reputation: 32481
No special analysis is needed for this query; you just have to use the Contains method. You could try this:
string myText = "Text to analyze for words, bar, foo";
List<string> words = new List<string> { "foo", "bar", "xyz" };
var result = words.Where(i => myText.Contains(i)).ToList();
// result: foo, bar (in the order of the words list)
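Note that Contains is case-sensitive, so "Foo" in the text would not match "foo" in the list. If that matters, a small variation using IndexOf with a StringComparison might do. This is only a sketch; the ContainsDemo class and FindIgnoreCase method are hypothetical names, and OrdinalIgnoreCase is an assumption about the comparison you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsDemo
{
    // Hypothetical helper: case-insensitive substring search.
    // IndexOf(string, StringComparison) returns -1 when there is no match.
    public static List<string> FindIgnoreCase(string text, IEnumerable<string> words)
    {
        return words
            .Where(w => text.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }
}
```

Like the original query, this still matches substrings rather than whole words, and the result keeps the order of the words list.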
Upvotes: 9