Louis

Reputation: 725

Partial match with Regular Expression

Is there a way to determine whether a single character is valid for a regular expression that expects a specific number of such characters?

I have a WPF custom keyboard and would like to adjust each key's availability based on a regular expression. This works well when the expression is fairly simple and does not require a specific order of characters or a specific length to satisfy the pattern.

However, when the pattern becomes more complex and specific, testing a single char against it will always fail.

For instance, given the regular expression [a-zA-Z0-9]{4}

These values will succeed:

The expression clearly expects alphanumeric characters only. I would like a method that, given the expression, rejects a special character such as "%" but accepts "a", since "a" is acceptable in [a-zA-Z0-9]. The only issue is that the required length will not be satisfied.

I am currently using Regex.IsMatch. I guess I am looking for a partial match testing method.

Upvotes: 5

Views: 1775

Answers (1)

Lucas Trzesniewski

Reputation: 51330

Sure, you can, but unfortunately not with .NET's built-in regex engine. You can use PCRE instead, which provides the partial matching feature you're asking for.

From the PCRE docs:

In normal use of PCRE, if the subject string that is passed to a matching function matches as far as it goes, but is too short to match the entire pattern, PCRE_ERROR_NOMATCH is returned. There are circumstances where it might be helpful to distinguish this case from other cases in which there is no match.

Consider, for example, an application where a human is required to type in data for a field with specific formatting requirements. An example might be a date in the form ddmmmyy, defined by this pattern:

 ^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$

If the application sees the user's keystrokes one by one, and can check that what has been typed so far is potentially valid, it is able to raise an error as soon as a mistake is made, by beeping and not reflecting the character that has been typed, for example. This immediate feedback is likely to be a better user interface than a check that is delayed until the entire string has been entered. Partial matching can also be useful when the subject string is very long and is not all available at once.

PCRE supports partial matching by means of the PCRE_PARTIAL_SOFT and PCRE_PARTIAL_HARD options, which can be set when calling any of the matching functions. For backwards compatibility, PCRE_PARTIAL is a synonym for PCRE_PARTIAL_SOFT. The essential difference between the two options is whether or not a partial match is preferred to an alternative complete match, though the details differ between the two types of matching function. If both options are set, PCRE_PARTIAL_HARD takes precedence.


But PCRE is a C library... So I've built a PCRE wrapper for .NET.

Usage example from the readme:

var regex = new PcreRegex(@"(?<=abc)123");
var match = regex.Match("xyzabc12", PcreMatchOptions.PartialSoft);
// result: match.IsPartialMatch == true
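
Applied to the keyboard scenario from the question, the same call could be used to decide which keys stay enabled. The following is only a minimal sketch, not part of the readme: `KeyFilter`, `IsViablePrefix` and `EnabledKeys` are hypothetical names, the `PCRE` namespace and the `Success` property are assumed from PCRE.NET, and the pattern is anchored with `^` and `$` on the assumption that an unanchored pattern could otherwise begin a partial match later in the input.

```csharp
using System.Collections.Generic;
using PCRE; // assumed namespace of the PCRE.NET package

static class KeyFilter
{
    // Hypothetical helper: true when the text typed so far either matches
    // the pattern completely or could still grow into a match.
    public static bool IsViablePrefix(PcreRegex regex, string candidate)
    {
        var match = regex.Match(candidate, PcreMatchOptions.PartialSoft);
        return match.Success || match.IsPartialMatch;
    }

    // Hypothetical helper: keeps only the keys that leave the input viable
    // when appended to what has already been typed.
    public static List<char> EnabledKeys(PcreRegex regex, string typedSoFar, IEnumerable<char> keys)
    {
        var enabled = new List<char>();
        foreach (var key in keys)
        {
            if (IsViablePrefix(regex, typedSoFar + key))
                enabled.Add(key);
        }
        return enabled;
    }
}
```

With `new PcreRegex(@"^[a-zA-Z0-9]{4}$")`, calling `KeyFilter.EnabledKeys(regex, "ab", new[] { 'c', '%' })` is intended to keep 'c' and drop '%'. Each key is tested against the whole text typed so far plus that key, rather than against the key alone, so patterns that care about order or length are handled as well.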

A word of caution, though: the wrapper is currently at v0.3 and uses PCRE v8.36. PCRE v10.0 was released recently with a new API, so expect some breaking changes in the API of PCRE.NET v0.4. The behavior should stay the same, though.

Also be aware of the differences between the .NET and PCRE regex flavors, although this should not be a problem in most cases.

Upvotes: 5
