Pinco Pallino

Reputation: 1016

Regular expression to match non escaped character

I need a regular expression that captures a keyword's VALUE and its optional TYPE for each of the input forms listed below.

In the examples below, curly braces mark the fields I need to capture. SOMEVALUEXXX is always expected to be present, although in rare cases the value may be a null/empty string. TYPE=XXXX is an optional argument and may be absent.

The closest regular expression I have been able to produce is ^ANNIVERSARY(?:(?::)|(?:;.*:))([^:]*)$, which captures ANNIVERSARY's VALUE but fails when the value contains an escaped colon (/:).

SOMEKEYWORD:{SOMEVALUE}

SOMEKEYWORD:{SOMEVALUE/:WITHCOLONESCAPED}

SOMEKEYWORD:{SOMEVALUE/;WITHSEMICOLONESCAPED}

SOMEKEYWORD;TYPE={SOMETYPE}:{SOMEVALUE}

SOMEKEYWORD;TYPE={SOMETYPE}:{SOMEVALUE/:WITHCOLONESCAPED}

SOMEKEYWORD;TYPE={SOMETYPE}:{SOMEVALUE/;WITHSEMICOLONESCAPED}

SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE}:{SOMEVALUE}

SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE}:{SOMEVALUE/:WITHCOLONESCAPED}

SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE}:{SOMEVALUE/;WITHSEMICOLONESCAPED}

SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE};ARG2=MYARG2:{SOMEVALUE}

SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE};ARG2=MYARG2:{SOMEVALUE/:WITHCOLONESCAPED}

SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE};ARG2=MYARG2:{SOMEVALUE/;WITHSEMICOLONESCAPED}
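The failure described above can be reproduced with a short C# sketch. The pattern is the one quoted in the question; the ANNIVERSARY sample lines and the Describe helper are made up for illustration and are not taken from real data.

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern quoted in the question, unchanged.
var attempt = new Regex(@"^ANNIVERSARY(?:(?::)|(?:;.*:))([^:]*)$");

// Hypothetical inputs, invented for this illustration.
string[] samples =
{
    "ANNIVERSARY:SOMEVALUE",
    "ANNIVERSARY:SOMEVALUE/:WITHCOLONESCAPED",
    "ANNIVERSARY;TYPE=SOMETYPE:SOMEVALUE/:WITHCOLONESCAPED",
};

foreach (string sample in samples)
{
    Console.WriteLine($"{sample} -> {Describe(sample)}");
}

// Hypothetical helper: returns the first capture group, or a marker when nothing matches.
string Describe(string line)
{
    Match m = attempt.Match(line);
    return m.Success ? m.Groups[1].Value : "<no match>";
}
```

Because the capture group is [^:]*, it cannot contain any colon, escaped or not. The second sample therefore does not match at all, and in the third the greedy ;.*: runs up to the last colon, so only the text after the escaped colon is captured.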

Upvotes: 1

Views: 159

Answers (1)

goTo-devNull

Reputation: 9372

I left the curly braces intact for demonstration; the pattern gives the same results when they are removed:

using System;
using System.Text.RegularExpressions;

var testing = new string[]
{
    "SOMEKEYWORD:{SOMEVALUE}",
    "SOMEKEYWORD:{SOMEVALUE/:WITHCOLONESCAPED}",
    "SOMEKEYWORD:{SOMEVALUE/;WITHSEMICOLONESCAPED}",
    "SOMEKEYWORD;TYPE={SOMETYPE}:{SOMEVALUE}",
    "SOMEKEYWORD;TYPE={SOMETYPE}:{SOMEVALUE/:WITHCOLONESCAPED}",
    "SOMEKEYWORD;TYPE={SOMETYPE}:{SOMEVALUE/;WITHSEMICOLONESCAPED}",
    "SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE}:{SOMEVALUE}",
    "SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE}:{SOMEVALUE/:WITHCOLONESCAPED}",
    "SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE}:{SOMEVALUE/;WITHSEMICOLONESCAPED}",
    "SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE};ARG2=MYARG2:{SOMEVALUE}",
    "SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE};ARG2=MYARG2:{SOMEVALUE/:WITHCOLONESCAPED}",
    "SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE};ARG2=MYARG2:{SOMEVALUE/;WITHSEMICOLONESCAPED}"
};

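// The named groups "type" and "value" hold the results; the number of
// other capture groups is kept as small as possible for readability.
var regex = new Regex(
    @"
        (
            # TYPE followed by further parameters: the type stops at the next ';'
            (TYPE=(?<type>[^;]+);[^:]*?)
            |
            # TYPE as the last parameter: the type stops at the first ':'
            (TYPE=(?<type>.*?))
        )?
        :               # separator between the parameters and the value
        (?<value>.*)$   # everything after that separator, including any '/:'
    ",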
    RegexOptions.Compiled 
    | RegexOptions.IgnoreCase
    | RegexOptions.IgnorePatternWhitespace
);

foreach (var test in testing)
{
    Match match = regex.Match(test);
    Console.Write(
        "type: [{0}] || value: [{1}]\n",
        match.Groups["type"].Value,
        match.Groups["value"].Value
    );
}

If case matters, remove RegexOptions.IgnoreCase.
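Note that the pattern above does not look at the '/' escape itself: it relies on the first ':' after the parameters being the separator, so a TYPE or argument that contains '/:' would be split too early. Below is a minimal sketch of an escape-aware variant, assuming '/' always escapes exactly the next character. The pattern, the escapeAware variable, the Parse helper and the last sample line are illustrative assumptions, not part of the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical escape-aware pattern: '/' followed by any character is consumed as one unit.
var escapeAware = new Regex(
    @"
        ^[^;:]+                               # keyword: up to the first ';' or ':'
        (?:
            ;                                 # every parameter starts with ';'
            (?:
                TYPE=(?<type>(?:/.|[^/;:])*)  # the TYPE parameter, escapes allowed
                |
                (?:/.|[^/;:])*                # any other parameter, skipped
            )
        )*
        :                                     # first unescaped ':' ends the parameters
        (?<value>.*)$                         # the rest is the value
    ",
    RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);

var lines = new[]
{
    "SOMEKEYWORD:{SOMEVALUE/:WITHCOLONESCAPED}",
    "SOMEKEYWORD;ARG1=MYARG1;TYPE={SOMETYPE};ARG2=MYARG2:{SOMEVALUE/;WITHSEMICOLONESCAPED}",
    "SOMEKEYWORD;TYPE=SOME/:TYPE:SOMEVALUE", // made-up case: escaped colon inside TYPE
};

foreach (string line in lines)
{
    var parsed = Parse(line);
    Console.WriteLine(parsed is null
        ? $"no match: {line}"
        : $"type: [{parsed.Value.Type}] || value: [{parsed.Value.Value}]");
}

// Hypothetical helper: returns null when the line does not have the expected shape.
(string Type, string Value)? Parse(string line)
{
    Match m = escapeAware.Match(line);
    if (!m.Success) return null;
    return (m.Groups["type"].Value, m.Groups["value"].Value);
}
```

Anchoring at ^ and requiring TYPE= directly after a ';' also keeps the pattern from picking up TYPE= inside a value or at the end of a longer argument name. The trade-off is that lines which do not follow the KEYWORD;ARG=...:VALUE shape no longer match at all.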

Upvotes: 1
