ElHaix

Reputation: 12996

C# Regex to extract all words that begin with numbers and are contained in a specific list?

Given a collection of word combinations, is it possible to return a collection of matched word sets and extract those from a given string using RegEx?

For example, given a car list:

mazda 3
mazda 4
volvo s40

The following text is used:
"I wanted to buy a mazda 3 however I found the volvo s40 to be a much better deal with the 90gv tires."

I now want two different lists from this, which should be:

{mazda 3, volvo s40, 90gv} 
{I, wanted, to, buy, a, however, I, found, the, to, be, a, much, better, deal, with, the, tires}

Upvotes: 0

Views: 358

Answers (1)

agent-j

Reputation: 27943

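This code passes a MatchEvaluator to Regex.Replace. Each match (a word followed by a token that contains a digit, which is how the car models are detected) is added to cars, and the evaluator returns "", so the model gets replaced with an empty string. cars is the list of car models found. words is a list of the remaining words. Note that the pattern does not consult a predefined list of models. I'll leave it to you to handle punctuation appropriately for your needs.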

List<string> cars = new List<string>();
string input =
   "I wanted to buy a mazda 3 however I found the volvo s40 to be a much better deal.";
string line = Regex.Replace(
   input, @"\b\w+\s+(?=\S*?\d)(?:\w+)",
   m =>
      {
         cars.Add(m.Value);
         return "";
      });
string [] words = line.Split(' ');

// Output the lists:
Console.Write ("Cars:");
foreach (string car in cars)
   Console.Write(car + "    ");
Console.WriteLine ();
Console.Write ("words: ");
foreach (string word in words)
   Console.Write(word + " ");

Produces this output:

Cars:mazda 3    volvo s40
words: I wanted to buy a  however I found the  to be a much better deal.
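
If the matches should come from the known list of models rather than from a generic "word followed by a number" pattern, the same MatchEvaluator idea can be driven by an alternation built from that list. The following is only a sketch under a few assumptions: the class name CarWordSplitter and its Split method are made up for illustration, the list is assumed to be non-empty, matching is case-insensitive, and any standalone word starting with a digit (such as 90gv) is treated as a match as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
static class CarWordSplitter
{
    public static (List<string> Matched, List<string> Words) Split(
        string input, IEnumerable<string> models)
    {
        // Escape each model so it is matched literally, and try longer
        // names first so a longer model wins over a shorter one at the
        // same position.
        string alternatives = string.Join("|",
            models.OrderByDescending(m => m.Length)
                  .Select(m => Regex.Escape(m)));

        // A listed model, or any whole word that begins with a digit.
        string pattern = @"\b(?:" + alternatives + @"|\d\w*)\b";

        var matched = new List<string>();
        string rest = Regex.Replace(input, pattern, m =>
        {
            matched.Add(m.Value);
            return " "; // keep the neighbouring words separated
        }, RegexOptions.IgnoreCase);

        // Split the remainder on non-word characters and drop empty
        // entries, which also strips the punctuation.
        List<string> words = Regex.Split(rest, @"\W+")
            .Where(w => w.Length > 0)
            .ToList();

        return (matched, words);
    }

    static void Main()
    {
        string[] models = { "mazda 3", "mazda 4", "volvo s40" };
        string input =
            "I wanted to buy a mazda 3 however I found the volvo s40 " +
            "to be a much better deal with the 90gv tires.";

        var result = Split(input, models);
        Console.WriteLine("{" + string.Join(", ", result.Matched) + "}");
        Console.WriteLine("{" + string.Join(", ", result.Words) + "}");
    }
}
```

With the text from the question this prints:

{mazda 3, volvo s40, 90gv}
{I, wanted, to, buy, a, however, I, found, the, to, be, a, much, better, deal, with, the, tires}

Splitting on \W+ is a blunt way to deal with punctuation. It would also break words such as "don't" in two, so adjust it if that matters for your text.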

Upvotes: 1
