C# Regex to extract after capture group numbers only

Question

I'm not sure what I'm doing wrong. I have the following:

(?:[A-Z]{2}\d{2}\s)

This is because my string always starts with two uppercase letters and two digits. After that I have data mixed with words, and I only want the numbers. I want to take this: AB12 (1,2,3 words, 4,5,6,7,8,9) and obtain this: AB12 (1,2,3,4,5,6,7,8,9)

I was trying

(?:[A-Z]{2}\d{2}\s)([0-9]+)

However, this is not working. Was I even close to achieving my goal?

Wiktor Stribiżew · Accepted Answer

To remove any character that is neither a digit nor a comma, you can use the [^,\d] negated character class, and use the (?<=\([^()]*) and (?=[^()]*\)) lookarounds to assert that the position is inside parentheses:

(?<=\([^()]*)\s*[^,\d]+(?=[^()]*\))

See the regex demo

The \s* consumes any optional whitespace (zero or more characters) that precedes the non-numeric text, so it is removed as well.
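To see what this kind of lookaround pattern does in practice, here is a minimal sketch. The class name, the method name StripNonNumeric, and the second input string are illustrative assumptions and not part of the question; the sketch relies on .NET's support for variable-length lookbehind.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InsideParensDemo
{
    // Hypothetical helper: removes runs of characters that are neither
    // digits nor commas, but only when they sit between "(" and ")".
    public static string StripNonNumeric(string input)
    {
        // Lookbehind: an opening "(" earlier, with no parentheses in between.
        // Lookahead: a closing ")" later, with no parentheses in between.
        const string pattern = @"(?<=\([^()]*)\s*[^,\d]+(?=[^()]*\))";
        return Regex.Replace(input, pattern, string.Empty);
    }

    public static void Main()
    {
        // Sample string from the question.
        Console.WriteLine(StripNonNumeric("AB12 (1,2,3 words, 4,5,6,7,8,9)"));
        // Made-up input: text outside the parentheses is left alone.
        Console.WriteLine(StripNonNumeric("note (7 apples, 8 pears) end"));
    }
}
```

The first call should print AB12 (1,2,3,4,5,6,7,8,9) and the second note (7,8) end, since only the text between the parentheses satisfies both lookarounds.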

If you need to restrict the match to strings that begin with your initial subpattern, add it to the lookbehind:

(?<=^[A-Z]{2}\d{2}\s+\([^()]*)\s*[^,\d]+(?=[^()]*\))
    ^^^^^^^^^^^^^^^^^

A C# demo:

using System;
using System.IO;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main()
    {
        var str = "AB12 (1,2,3 words, 4,5,6,7,8,9)";
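        // Lookbehind: the "AB12 (" style prefix plus any non-parenthesis text;
        // lookahead: a closing ")" further on with no parentheses in between.
        var pat = @"(?<=^[A-Z]{2}\d{2}\s+\([^()]*)\s*[^,\d]+(?=[^()]*\))";
        var res = Regex.Replace(str, pat, string.Empty);
        Console.WriteLine(res); // AB12 (1,2,3,4,5,6,7,8,9)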
    }
}
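
For comparison, the same result can probably be reached without lookarounds by doing the work in two steps: match each parenthesized group, then clean only its contents. This is a different technique from the one in the answer, shown only as a sketch; the class and method names are hypothetical, and it assumes the parentheses are not nested.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TwoStepDemo
{
    // Hypothetical helper: finds each "(...)" group and strips everything
    // except digits and commas from the text inside it.
    public static string KeepDigitsAndCommas(string input)
    {
        return Regex.Replace(
            input,
            @"\(([^()]*)\)", // group 1 = text between the parentheses
            m => "(" + Regex.Replace(m.Groups[1].Value, @"[^,\d]+", string.Empty) + ")");
    }

    public static void Main()
    {
        Console.WriteLine(KeepDigitsAndCommas("AB12 (1,2,3 words, 4,5,6,7,8,9)"));
    }
}
```

This version trades the single pattern for a MatchEvaluator callback, which may be easier to read and to port to regex engines that lack variable-length lookbehind.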
