MikeDub

Reputation: 5283

C#: Match multiple regex groups, keeping each match/word separate

Regex is doing my head in at the moment: I can't figure out how to obtain multiple match groups from a single string while keeping the resulting words separate.

For example, I have the string 'GeForce TURBO-GTX1080-8G NVIDIA'.

I want to find each set of digits in the string (I know \d matches a decimal digit), i.e. 1080 and 8. If I could capture them with their surrounding text as well (GTX1080 and 8G), that would be even better.

I then want to be able to pull these out of the match and compare them with another string, but I want each word / match separate for comparison.

That is, I want to run the same match against a different string such as 'GeForce TURBO-GTX1070-4G' (which would return 1070 and 4) and compare the matches against each other.

If I use groups, e.g. (\d)(\d\s), the matches don't seem to succeed.

I've looked around for answers and saw posts such as this...

Multiple Group Matches

I've also played around with regexstorm.net, but I'm still having trouble making that extra step.

Can anyone please shed some light on this?

Upvotes: 3

Views: 2797

Answers (1)

Wiktor Stribiżew

Reputation: 626802

You seem to seek

GeForce\s+\w+-(\w+)-(\w+)

The regex demo is available here.

Pattern explanation:

  • GeForce - a literal substring GeForce
  • \s+ - 1 or more whitespaces
  • \w+- - 1+ word chars and a hyphen
  • (\w+) - Group 1 capturing 1+ word chars
  • - - a hyphen
  • (\w+) - Group 2 capturing 1+ word chars

To access the groups, use Match.Groups[X].Value.

C# demo:

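using System;
using System.Linq;
using System.Text.RegularExpressions;

var re = @"GeForce\s+\w+-(\w+)-(\w+)";
var str = "GeForce TURBO-GTX1080-8G NVIDIA\nGeForce TURBO-GTX1070-4Gi";
// For each match, collect the capturing group values (Skip(1) drops Group 0, the whole match)
var res = Regex.Matches(str, re)
        .Cast<Match>()
        .Select(m => m.Groups.Cast<Group>().Skip(1).Select(g => g.Value))
        .ToList();
// Print the group values of each match on one line
foreach (var m in res)
    Console.WriteLine(string.Join(" : ", m));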
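Since the goal is to compare the parts of two different strings, here is a rough sketch of how that might look. ExtractParts is a hypothetical helper name I'm introducing for illustration, and the position-by-position comparison is only an assumption about what "compare" should mean here:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not a library method): returns the captured groups
// of the first match, or an empty array when the input does not match.
string[] ExtractParts(string input)
{
    var match = Regex.Match(input, @"GeForce\s+\w+-(\w+)-(\w+)");
    return match.Success
        ? match.Groups.Cast<Group>().Skip(1).Select(g => g.Value).ToArray()
        : Array.Empty<string>();
}

var a = ExtractParts("GeForce TURBO-GTX1080-8G NVIDIA");
var b = ExtractParts("GeForce TURBO-GTX1070-4G");

// Assumed comparison: compare the parts position by position.
for (var i = 0; i < Math.Min(a.Length, b.Length); i++)
    Console.WriteLine($"{a[i]} vs {b[i]}: {(a[i] == b[i] ? "same" : "different")}");

// Prints:
// GTX1080 vs GTX1070: different
// 8G vs 4G: different
```

Keeping each part as its own array element means every word stays separate, so you can swap the equality check for whatever comparison you actually need.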

If you also need to capture the digits on their own, use

GeForce\s+\w+-([^\W\d]*(\d+)[^\W\d]*)-([^\W\d]*(\d+)[^\W\d]*)

See this regex demo. The code will be the same as above.

Here, each \w+ is replaced with [^\W\d]*(\d+)[^\W\d]*, which matches:

  • [^\W\d]* - zero or more word chars except digits (roughly [\p{L}_], or exactly [\w-[\d]])
  • (\d+) - Group X that captures one or more digits
  • [^\W\d]* - same as the first item
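With the nested groups, each match now has four groups, numbered by the position of their opening parenthesis. As a minimal sketch (the variable names firstNumber and secondNumber are mine, and parsing to int assumes you want a numeric comparison), you could read them like this:

```csharp
using System;
using System.Text.RegularExpressions;

var pattern = @"GeForce\s+\w+-([^\W\d]*(\d+)[^\W\d]*)-([^\W\d]*(\d+)[^\W\d]*)";
var m = Regex.Match("GeForce TURBO-GTX1080-8G NVIDIA", pattern);

if (m.Success)
{
    // Groups 1 and 3 hold the whole words, Groups 2 and 4 the digits inside them.
    Console.WriteLine(m.Groups[1].Value); // GTX1080
    Console.WriteLine(m.Groups[2].Value); // 1080
    Console.WriteLine(m.Groups[3].Value); // 8G
    Console.WriteLine(m.Groups[4].Value); // 8

    // Assumption: the digits are small enough to fit in an int.
    int firstNumber = int.Parse(m.Groups[2].Value);
    int secondNumber = int.Parse(m.Groups[4].Value);
    Console.WriteLine(firstNumber > secondNumber); // True
}
```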


Upvotes: 3
