user8645307

Convert regex matches to a list of strings

I'm trying to find the strings that contain a given substring in a big list of about 50,000 strings. This approach works fine:

var results = myList.FindAll(delegate (string s) { return s.Contains(myString); });

but it also matches when the substring is only part of a word. For example, if I'm looking for "you do", it also finds "you dont", because that contains "you do".

So, this answer to my previous question supposedly should work as I need, but I'm not sure how to get a list of strings from the regex matches in this particular code:

foreach (string phrase in matchWordsList)
{
     foreach (string str in bigList)
     {
          string[] stringsToTest = new[] { phrase };
          var escapedStrings = stringsToTest.Select(s => Regex.Escape(s)); 
          var regex = new Regex("\\b(" + string.Join("|", escapedStrings) + ")\\b");
          var matches = regex.Matches(str);

          foreach (string result in matches) /// Incorrect: System.InvalidCastException 
          {
              resultsList.Add(result);
          }
     }
}

Adding the matches directly to the list as strings throws an exception:

An unhandled exception of type 'System.InvalidCastException' occurred in test.exe

Additional information: Unable to cast object of type 'System.Text.RegularExpressions.Match' to type 'System.String'.

So I'm trying to figure out how to convert var matches = regex.Matches(str); to a list of strings.

Upvotes: 0

Views: 3296

Answers (3)

ProgrammingLlama

Reputation: 38767

I may have misunderstood what you were trying to do in your previous question.

Would this work? It combines your "matchWordsList" into a single expression, and then adds each matching string from bigList to resultsList:

var escapedStrings = matchWordsList.Select(s => Regex.Escape(s)); 
var regex = new Regex("\\b(" + string.Join("|", escapedStrings) + ")\\b");
foreach (string str in bigList)
{
    if (regex.IsMatch(str))
    {
        resultsList.Add(str);
    }
}

So if matchWordsList contains ["test","words","cheese"], and str is "This is a test to check if Regex is matching words. I like cheese.", it will add str to resultsList once (even though there are 3 matches).

Upvotes: 0

TheGeneral

Reputation: 81493

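You can do it with LINQ. However, on .NET Framework MatchCollection only implements the non-generic IEnumerable, so you need to Cast<Match>() first and then Select the Value of each match: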

var resultsList = regex.Matches(str)
                       .Cast<Match>()
                       .Select(m => m.Value)
                       .ToList();

or

someList.AddRange(
   regex.Matches(str)
         .Cast<Match>()
         .Select(m => m.Value));

Upvotes: 2

Kirill Polishchuk

Reputation: 56162

Simply use the Match type in the foreach loop and read the matched text from its Value property:

foreach (Match result in matches)
{
    resultsList.Add(result.Value);
}
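
For reference, here is a minimal sketch that puts the typed foreach together with the whole-phrase regex from the question. The class name WholePhraseSearch and the methods FindWholePhrase and MatchedValues are hypothetical names chosen for illustration; they do not appear in the original code. The sketch also assumes the phrase list is non-empty and that each phrase starts and ends with a word character, since \b only behaves as a whole-word guard in that case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, named here only for illustration.
public static class WholePhraseSearch
{
    // Returns every string in bigList that contains at least one of the
    // phrases as a whole word or phrase (assumes phrases is non-empty).
    public static List<string> FindWholePhrase(IEnumerable<string> bigList, IEnumerable<string> phrases)
    {
        // Escape each phrase so regex metacharacters are treated literally.
        var escaped = phrases.Select(s => Regex.Escape(s));

        // Build the regex once, outside the loop over the big list.
        var regex = new Regex(@"\b(" + string.Join("|", escaped) + @")\b");

        return bigList.Where(s => regex.IsMatch(s)).ToList();
    }

    // Returns the matched text of every match in input.
    public static List<string> MatchedValues(Regex regex, string input)
    {
        var values = new List<string>();

        // Iterate as Match, not string: the collection holds Match objects,
        // and the matched text is exposed through Match.Value.
        foreach (Match m in regex.Matches(input))
        {
            values.Add(m.Value);
        }

        return values;
    }
}
```

Under these assumptions, FindWholePhrase called with the phrase "you do" would keep "what do you do?" but skip "you dont know", because there is no word boundary between "do" and "n". Building the regex once rather than inside the nested loops should also matter for a list of about 50,000 strings.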

Upvotes: 0
