accand

Reputation: 561

Check if a string contains a list of substrings and save the matching ones

I have a string containing some text:

string myText = "Text to analyze for words, bar, foo";   

And I have a list of words to search for in it:

List<string> words = new List<string> {"foo", "bar", "xyz"};

I'd like to know the most efficient way, if one exists, to get the list of words that are contained in the text, something like this:

List<string> matches = myText.findWords(words);

Upvotes: 10

Views: 6887

Answers (5)

user2023861

Reputation: 8208

Here's a simple solution that accounts for whitespace and punctuation:

static void Main(string[] args)
{
    string sentence = "Text to analyze for words, bar, foo";            
    var words = Regex.Split(sentence, @"\W+");
    var searchWords = new List<string> { "foo", "bar", "xyz" };
    var foundWords = words.Intersect(searchWords);

    foreach (var item in foundWords)
    {
        Console.WriteLine(item);
    }

    Console.ReadLine();
}

Upvotes: 0

Javier

Reputation: 670

A Regex solution

var words = new string[]{"Lucy", "play", "soccer"};
var text = "Lucy loves going to the field and play soccer with her friend";
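// Escape each word so any regex metacharacters in it are matched literally.
var match = new Regex(String.Join("|", words.Select(w => Regex.Escape(w)))).Match(text);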
var result = new List<string>();

while (match.Success) {
    result.Add(match.Value);
    match = match.NextMatch();
}

//Result ["Lucy", "play", "soccer"]

Upvotes: 3

Yuval Itzchakov

Reputation: 149538

You can use a HashSet<string> and intersect both collections:

string myText = "Text to analyze for words, bar, foo"; 
string[] splitWords = myText.Split(' ', ',');

HashSet<string> hashWords = new HashSet<string>(splitWords,
                                                StringComparer.OrdinalIgnoreCase);
HashSet<string> words = new HashSet<string>(new[] { "foo", "bar" },
                                            StringComparer.OrdinalIgnoreCase);

hashWords.IntersectWith(words);
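
After the call, hashWords itself holds the matches, because IntersectWith modifies the set in place rather than returning a new one. A minimal sketch of wrapping this in a reusable method follows; the class name IntersectDemo and the method FindWords are hypothetical names chosen for illustration, and the separator list (space and comma) is an assumption carried over from the snippet above.

```csharp
using System;
using System.Collections.Generic;

public static class IntersectDemo
{
    // Hypothetical helper: returns the tokens of `text` that also appear in `words`,
    // compared case-insensitively.
    public static HashSet<string> FindWords(string text, IEnumerable<string> words)
    {
        // Assumption: words in the text are separated only by spaces and commas.
        var separators = new[] { ' ', ',' };
        var tokens = text.Split(separators, StringSplitOptions.RemoveEmptyEntries);

        var found = new HashSet<string>(tokens, StringComparer.OrdinalIgnoreCase);

        // IntersectWith mutates `found`, keeping only elements also present in `words`.
        found.IntersectWith(words);
        return found;
    }

    public static void Main()
    {
        var found = FindWords("Text to analyze for words, Bar, foo",
                              new[] { "foo", "bar", "xyz" });

        // The set keeps the casing from the text, and enumeration order is not guaranteed.
        Console.WriteLine(string.Join(", ", found));
    }
}
```

Note that the surviving entries carry the casing found in the text ("Bar" here), not the casing of the search list.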

Upvotes: 5

Shar1er80

Reputation: 9041

Building on the idea that you want to be able to call myText.findWords(words), you can write an extension method on the String class that does just that.

public static class StringExtensions
{
    public static List<string> findWords(this string str, List<string> words)
    {
        return words.Where(str.Contains).ToList();
    }
}

Usage:

string myText = "Text to analyze for words, bar, foo";
List<string> words = new List<string> { "foo", "bar", "xyz" };
List<string> matches = myText.findWords(words);
Console.WriteLine(String.Join(", ", matches.ToArray()));
Console.ReadLine();

Results:

foo, bar

Upvotes: 0

Hossein Narimani Rad

Reputation: 32481

This doesn't need anything special beyond the Contains method, so you could try this:

string myText = "Text to analyze for words, bar, foo";

List<string> words = new List<string> { "foo", "bar", "xyz" };

var result = words.Where(i => myText.Contains(i)).ToList();
// result: foo, bar (in the order of the words list)
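
One thing to be aware of: Contains is a substring test, so "bar" would also match inside "barn". If whole-word matches are what you need, a sketch along these lines may help. The class WordMatchDemo and the method FindWholeWords are hypothetical names for illustration, and the \b word-boundary approach assumes each search word begins and ends with a letter, digit, or underscore.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordMatchDemo
{
    // Hypothetical helper: keeps only the candidates that occur as whole words.
    // Assumption: each candidate starts and ends with a word character,
    // otherwise \b will not behave as a word boundary around it.
    public static List<string> FindWholeWords(string text, IEnumerable<string> words)
    {
        return words
            .Where(w => Regex.IsMatch(text, @"\b" + Regex.Escape(w) + @"\b"))
            .ToList();
    }

    public static void Main()
    {
        string text = "The barn is full of food";
        var words = new List<string> { "foo", "bar", "full" };

        // Substring test: "foo" and "bar" match inside "food" and "barn".
        var bySubstring = words.Where(w => text.Contains(w)).ToList();

        // Whole-word test: only "full" stands alone in the text.
        var byWholeWord = FindWholeWords(text, words);

        Console.WriteLine(string.Join(", ", bySubstring)); // foo, bar, full
        Console.WriteLine(string.Join(", ", byWholeWord)); // full
    }
}
```

For the text in the question both approaches give the same result; they only differ when a search word appears inside a longer word.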

Upvotes: 9
