NikolaiDante

Reputation: 18639

Extract comma separated portion of string with a RegEx in C#

Sample data: !!Part|123456,ABCDEF,ABC132!!

The comma-delimited list can contain any number of entries, and each entry can be any combination of letters and digits.

I want a regex to match each entry in the comma-separated list.

What I have is: !!PART\|(\w+)(?:,{1}(\w+))*!!

This seems to do the job. The thing is, I want to retrieve the entries in order into an ArrayList or similar, so for the sample data I would want 123456, ABCDEF and ABC132, in that order.

The code I have is:

string partRegularExpression = @"!!PART\|(\w+)(?:,{1}(\w+))*!!";
Match match = Regex.Match(tag, partRegularExpression);
ArrayList results = new ArrayList();

foreach (Group group in match.Groups)
{
    results.Add(group.Value);
}

But that's giving me unexpected results. What am I missing?

Thanks

Edit: One solution would be to use a regex like !!PART\|(\w+(?:,??\w+)*)!! to capture the whole comma-separated list and then split it, as suggested by Marc Gravell.

I am still curious for a working regex for this however :o)
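
The capture-then-split approach from the edit above might look roughly like the following. This is only a sketch: the class and method names are hypothetical, the lazy `,??` is simplified to a plain `,`, and RegexOptions.IgnoreCase is assumed because the pattern says PART while the sample data says Part.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ListThenSplit
{
    // Hypothetical helper: captures the whole comma separated list in
    // group 1, then splits that single capture on commas.
    public static string[] Extract(string tag)
    {
        Match match = Regex.Match(
            tag,
            @"!!PART\|(\w+(?:,\w+)*)!!",
            RegexOptions.IgnoreCase); // assumption: case should not matter

        // No match means the tag was not in the expected format.
        return match.Success
            ? match.Groups[1].Value.Split(',')
            : new string[0];
    }
}
```

Calling ListThenSplit.Extract on the sample data returns 123456, ABCDEF and ABC132 in that order, and an input that does not match the pattern returns an empty array.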

Upvotes: 2

Views: 6877

Answers (4)

Martin Brown

Reputation: 25310

I think the RegEx you are looking for is this:

(?:^!!PART\|){0,1}(?<value>.*?)(?:,|!!$)

It can then be run like this:

string tag = "!!Part|123456,ABCDEF,ABC132!!";

string partRegularExpression = @"(?:^!!PART\|){0,1}(?<value>.*?)(?:,|!!$)";
ArrayList results = new ArrayList();

Regex extractNumber = new Regex(partRegularExpression, RegexOptions.IgnoreCase);
MatchCollection matches = extractNumber.Matches(tag);
foreach (Match match in matches)
{
    results.Add(match.Groups["value"].Value);
}

foreach (string s in results)
{
    Console.WriteLine(s);
}

Upvotes: 1

ICR

Reputation: 14162

You can either use split:

string csv = tag.Substring(7, tag.Length - 9);
string[] values = csv.Split(new char[] { ',' });

Or a regex:

Regex csvRegex = new Regex(@"!!Part\|(?:(?<value>\w+),?)+!!");
List<string> valuesRegex = new List<string>();
foreach (Capture capture in csvRegex.Match(tag).Groups["value"].Captures)
{
    valuesRegex.Add(capture.Value);
}
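
The reason the loop over match.Groups in the question loses values is that a group repeated inside a quantifier keeps only its last repetition in Group.Value; the earlier repetitions are only reachable through Group.Captures. A minimal self-contained sketch of reading them, assuming the sample tag from the question (the class and method names are hypothetical, and RegexOptions.IgnoreCase is an added assumption so that PART and Part both match):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: returns every value captured by the repeated
    // "value" group, in the order the values appear in the tag.
    public static List<string> ExtractValues(string tag)
    {
        Regex csvRegex = new Regex(
            @"!!Part\|(?:(?<value>\w+),?)+!!",
            RegexOptions.IgnoreCase); // assumption: case should not matter

        List<string> values = new List<string>();
        Match match = csvRegex.Match(tag);
        if (!match.Success)
        {
            return values; // tag not in the expected format
        }

        // Group.Value would give only the last repetition; Captures
        // holds one entry per repetition of the group.
        foreach (Capture capture in match.Groups["value"].Captures)
        {
            values.Add(capture.Value);
        }
        return values;
    }
}
```

For the sample data, CaptureDemo.ExtractValues returns 123456, ABCDEF and ABC132, whereas match.Groups["value"].Value on the same match would be just ABC132.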

Upvotes: 3

ZombieSheep

Reputation: 29953

The following code

string testString = "!!Part|123456,ABCDEF,ABC132!!";
foreach(string component in testString.Split("|!,".ToCharArray(),StringSplitOptions.RemoveEmptyEntries) )
{
    Console.WriteLine(component);
}

will give the following output

Part
123456
ABCDEF
ABC132

Because "Part" ends up at index 0 of the resulting array, the comma-separated values land at indexes 1, 2 and 3, which matches the index numbers you (possibly by accident) specified in the original question.

HTH

-EDIT- I forgot to mention: this may misbehave if a string does not follow the format shown above, but then again a regex would break just as easily unless it were stupendously complex.

Upvotes: 0

Marc Gravell

Reputation: 1062550

Unless I'm mistaken, that still only counts as one group. I'm guessing you'll need to do a string.Split(',') to do what you want? Indeed, it looks a lot simpler to not bother with regex at all here... Depending on the data, how about:

if (tag.StartsWith("!!Part|") && tag.EndsWith("!!"))
{
    tag = tag.Substring(7, tag.Length - 9);
    string[] data = tag.Split(',');
}
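
The same idea can be wrapped in a small helper so the magic numbers 7 and 9 come from the prefix and suffix themselves. This is a sketch only: the PartTagParser class and its members are hypothetical names, and returning an empty array for malformed input is an assumed design choice.

```csharp
using System;

public static class PartTagParser
{
    private const string Prefix = "!!Part|";
    private const string Suffix = "!!";

    // Hypothetical helper: returns the comma separated entries, or an
    // empty array when the tag lacks the expected prefix or suffix.
    public static string[] Parse(string tag)
    {
        if (tag == null
            || tag.Length < Prefix.Length + Suffix.Length
            || !tag.StartsWith(Prefix, StringComparison.Ordinal)
            || !tag.EndsWith(Suffix, StringComparison.Ordinal))
        {
            return new string[0];
        }

        // Strip the prefix and suffix, leaving only the list itself.
        string csv = tag.Substring(
            Prefix.Length,
            tag.Length - Prefix.Length - Suffix.Length);

        return csv.Split(',');
    }
}
```

Unlike the regex versions, this does no validation of the entries themselves, so a tag such as !!Part|!! yields a single empty string instead of being rejected.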

Upvotes: 1
