Kinexus

Reputation: 12904

Variable-length colon-separated string regex validation

I am trying to write a regex to validate and extract the values from a colon-separated string that can have 1-4 values. I have found examples with a fixed number of values and tried to adapt one, but it only picks up the first and last values; I need to extract all of them. The current regex also includes the : in the match, whereas I just want the value, if possible.

I am currently using this:

^([01ab])+(\:[01ab])*

but it only pulls the first and last values, not those in between if they exist.

Valid values:

0

0:a

0:a:1

0:1:a:b

Not valid:

0:a:

0:a:1:b:

Upvotes: 2

Views: 163

Answers (3)

Tim Rutter

Reputation: 4679

A non-regex approach (and why use regex unless you really have to?) is this:

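// Requires: using System.Linq;
bool Validate(string s)
{
    string[] valid = {"0", "1", "a", "b"};
    // Split keeps empty entries, so a trailing ':' yields "" and fails below.
    var splitArray = s.Split(':');

    if (splitArray.Length < 1 || splitArray.Length > 4)
        return false;

    // Every value must be one of the allowed tokens.
    return splitArray.All(a => valid.Contains(a));
}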

Upvotes: 0

Wiktor Stribiżew

Reputation: 627103

I suggest a two-step approach: validate the format with the regex, and then split the string on : if it qualifies:

if (Regex.IsMatch(text, @"^[01ab](?::[01ab])*$")) 
{
    result = text.Split(':');
}

The ^[01ab](?::[01ab])*$ regex matches start of a string with ^, a 0, 1, a or b, and then 0 or more repetitions of : followed with a 0, 1, a or b and then end of string ($).
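
Note that the pattern above accepts any number of values. If the 1-4 limit from the question also has to be enforced, one option is to bound the repetition with {0,3}. A minimal sketch, assuming a hypothetical helper called TryGetValues (the class and method names are illustrative, not part of the original answer):

```csharp
using System;
using System.Text.RegularExpressions;

static class ColonList
{
    // One value, then at most three more ":value" pairs (1-4 values in total).
    // \z is used instead of $ because $ in .NET also matches before a final newline.
    static readonly Regex Format = new Regex(@"^[01ab](?::[01ab]){0,3}\z");

    // Hypothetical helper: validates first, then splits only if the format is valid.
    public static bool TryGetValues(string text, out string[] values)
    {
        bool ok = Format.IsMatch(text);
        values = ok ? text.Split(':') : new string[0];
        return ok;
    }
}

static class Demo
{
    static void Main()
    {
        foreach (var input in new[] { "0", "0:a", "0:a:1", "0:1:a:b", "0:a:", "0:1:a:b:0" })
        {
            bool ok = ColonList.TryGetValues(input, out string[] values);
            Console.WriteLine("{0} -> {1}", input, ok ? string.Join(", ", values) : "not valid");
        }
    }
}
```

With this bound, the four valid samples from the question pass, while a trailing colon or a fifth value is rejected.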

If you want to experiment with the regex a bit, note that .NET lets you access every value captured by a repeated group via its CaptureCollection:

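// Requires: using System.Linq; using System.Text.RegularExpressions;
var text = "0:1:a:b";
// Group 1 sits inside a repeated group, so each value it matched is kept in Captures.
// Regex.Match never returns null, so no null-conditional operator is needed.
var results = Regex.Match(text, @"^(?:([01ab])(?::\b|$))+$")
        .Groups[1].Captures.Cast<Capture>().Select(c => c.Value);
Console.WriteLine(string.Join(", ", results)); // => 0, 1, a, b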

See the C# demo and the regex demo.

Regex details

  • ^ - start of string
  • (?:([01ab])(?::\b|$))+ - 1 or more repetitions of:
    • ([01ab]) - Group 1: 0, 1, a or b
    • (?::\b|$) - either : followed by a word character (\b would also allow _ to follow, but [01ab] does not match it, so such input is still rejected), or end of string
  • $ - end of string.
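
The reason the pattern in the question appears to lose the middle values is that a repeated capturing group reports only its last iteration through Group.Value, whereas .NET keeps every iteration in Group.Captures. A small sketch comparing the two with the asker's pattern left unchanged (AllCaptures is a hypothetical helper name introduced for illustration):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class CaptureDemo
{
    // Hypothetical helper: returns every capture recorded for the given group number.
    public static string[] AllCaptures(Match m, int group)
    {
        return m.Groups[group].Captures.Cast<Capture>().Select(c => c.Value).ToArray();
    }

    static void Main()
    {
        // The pattern from the question, unchanged.
        Match m = Regex.Match("0:1:a:b", @"^([01ab])+(\:[01ab])*");

        Console.WriteLine(m.Groups[1].Value);                   // 0
        Console.WriteLine(m.Groups[2].Value);                   // :b  (last iteration only)
        Console.WriteLine(string.Join(" ", AllCaptures(m, 2))); // :1 :a :b
    }
}
```

The captures of group 2 still include the colon because the colon is inside the capturing parentheses; moving it outside, as in (?::([01ab]))*, leaves only the values.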

Upvotes: 2

jdweng

Reputation: 34421

A string method is more efficient than regex here, so try the following:

using System;
using System.Linq;

namespace ConsoleApplication137
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] allowed = { "0", "1", "a", "b" };
            string[] inputs = { "0", "0:a", "0:a:1", "0:1:a:b", "0:a:", "0:a:1:b:" };

            foreach (string input in inputs)
            {
                // Keep empty entries so that a trailing ':' produces an empty (invalid) value.
                string[] splitArray = input.Split(':');

                // Valid input has 1-4 values, each one of the allowed tokens.
                if (splitArray.Length > 4 || !splitArray.All(v => allowed.Contains(v)))
                {
                    Console.WriteLine("Input: '{0}' Not Valid", input);
                }
                else
                {
                    Console.WriteLine("Input: '{0}' Values : '{1}'", input, string.Join("', '", splitArray));
                }
            }
            Console.ReadLine();
        }
    }
}

Upvotes: -1
