Kerfluffel

Reputation: 147

How to regex - get alpha from beginning OR end of alpha string (and numeric)

I'm not very strong in regex and am trying to determine whether this is possible.

I have incoming alphanumeric data that can take any of several forms:

100
NW100
W100
100NW
100W

(The letters will not always be W/NW; these are just examples.)

I would like to split the data into variables to hold the alpha and numeric separately. Using regex groups, I have this so far:

    var numAlpha = new Regex("(?<Alpha>[a-zA-Z]*)(?<Numeric>[0-9]*)");

    var startMatch = numAlpha.Match(myDataString);
    var alpha = startMatch.Groups["Alpha"].Value;
    int number = int.Parse(startMatch.Groups["Numeric"].Value);

This doesn't work for all cases, particularly when the alpha is at the end (e.g. 100W).

I would like it to check the beginning AND end of the string for the alphabetic part and still store it into the same alpha variable.

I was able to add a third group named Alpha2:

(?<Alpha>[a-zA-Z]*)(?<Numeric>[0-9]*)(?<Alpha2>[a-zA-Z]*)

but would like to keep it to two groups (and two variables) if it is possible. Any advice appreciated.

Upvotes: 1

Views: 287

Answers (2)

The fourth bird

Reputation: 163362

In .NET you can use an alternation | with the same group names:

(?<Alpha>[a-zA-Z]*)(?<Numeric>[0-9]*)|(?<Numeric>[0-9]*)(?<Alpha>[a-zA-Z]*)

Note that [a-zA-Z]* and [0-9]* can each match zero or more times, so the pattern can also match an empty string.
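
To make the empty-match caveat concrete, here is a minimal sketch. The class name EmptyMatchDemo, the Describe helper, and the anchored variant wrapped in ^(?:...)$ are illustrative assumptions, not part of the answer. Because the first alternative can succeed on an empty Alpha, a single unanchored Match call on "100W" should stop after "100"; anchoring forces the engine to fall through to the second alternative.

```csharp
using System.Text.RegularExpressions;

public static class EmptyMatchDemo
{
    // Pattern copied from the answer above; nothing is anchored.
    public static readonly Regex Unanchored = new Regex(
        "(?<Alpha>[a-zA-Z]*)(?<Numeric>[0-9]*)|(?<Numeric>[0-9]*)(?<Alpha>[a-zA-Z]*)");

    // Same alternation wrapped in ^(?:...)$ (an assumption, not from the answer)
    // so that the whole input must be consumed by one alternative.
    public static readonly Regex Anchored = new Regex(
        "^(?:(?<Alpha>[a-zA-Z]*)(?<Numeric>[0-9]*)|(?<Numeric>[0-9]*)(?<Alpha>[a-zA-Z]*))$");

    // Hypothetical helper: summarizes the first match as a string.
    public static string Describe(Regex regex, string input)
    {
        Match m = regex.Match(input);
        return $"Success={m.Success} Alpha='{m.Groups["Alpha"].Value}' Numeric='{m.Groups["Numeric"].Value}'";
    }
}
```

Calling EmptyMatchDemo.Describe(EmptyMatchDemo.Unanchored, "") reports a successful match with both groups empty, which is why int.Parse on the Numeric group can throw for such input.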

If you don't want to match empty strings, you can change the quantifiers to match one or more times and keep only the letters optional in the second alternative.

(?<Alpha>[a-zA-Z]+)(?<Numeric>[0-9]+)|(?<Numeric>[0-9]+)(?<Alpha>[a-zA-Z]*)
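
As a rough sketch of how this pattern could replace the two-variable code from the question: the class AlphaNumericSplitter and the method TrySplit are hypothetical names, and the ^(?:...)$ wrapper is an added assumption so that input with letters on both sides (such as "NW100W") is rejected rather than partially matched.

```csharp
using System.Text.RegularExpressions;

public static class AlphaNumericSplitter
{
    // Alternation from the answer; the ^(?:...)$ anchors are an assumption
    // added here so the whole input must match.
    private static readonly Regex Pattern = new Regex(
        "^(?:(?<Alpha>[a-zA-Z]+)(?<Numeric>[0-9]+)|(?<Numeric>[0-9]+)(?<Alpha>[a-zA-Z]*))$");

    // Hypothetical helper: returns false instead of throwing on bad input.
    public static bool TrySplit(string input, out string alpha, out int number)
    {
        alpha = string.Empty;
        number = 0;

        Match m = Pattern.Match(input);
        if (!m.Success)
        {
            return false;
        }

        // TryParse also guards against digit runs too long for an int.
        if (!int.TryParse(m.Groups["Numeric"].Value, out number))
        {
            return false;
        }

        // Both alternatives write to the same named group, so one lookup suffices.
        alpha = m.Groups["Alpha"].Value;
        return true;
    }
}
```

For example, AlphaNumericSplitter.TrySplit("100NW", out var alpha, out var number) should leave alpha as "NW" and number as 100, while plain "100" yields an empty alpha.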

Upvotes: 2

Wiktor Stribiżew

Reputation: 626845

You can use a conditional construct:

(?<Alpha>[a-zA-Z]+)?(?<Numeric>\d+)(?(Alpha)|(?<Alpha>[a-zA-Z]+)?)

Details:

  • (?<Alpha>[a-zA-Z]+)? - an optional group "Alpha" matching one or more ASCII letters
  • (?<Numeric>\d+) - Group "Numeric" matching one or more digits
  • (?(Alpha)|(?<Alpha>[a-zA-Z]+)?) - a conditional: if Group "Alpha" has already matched, match nothing; otherwise, optionally capture one or more ASCII letters into Group "Alpha".
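
The conditional pattern can be exercised with a small sketch like the one below. ConditionalSplitDemo and Split are hypothetical names used only for illustration, and the pattern is left unanchored as in the answer.

```csharp
using System.Text.RegularExpressions;

public static class ConditionalSplitDemo
{
    // Pattern copied from the answer. In .NET, \d also matches non-ASCII
    // decimal digits unless RegexOptions.ECMAScript is set; [0-9] is a
    // stricter alternative if only ASCII digits are expected.
    private static readonly Regex Pattern = new Regex(
        @"(?<Alpha>[a-zA-Z]+)?(?<Numeric>\d+)(?(Alpha)|(?<Alpha>[a-zA-Z]+)?)");

    // Hypothetical helper: returns the matched text plus both groups as strings.
    // All three values are empty when the input contains no digits.
    public static (string Matched, string Alpha, string Numeric) Split(string input)
    {
        Match m = Pattern.Match(input);
        return (m.Value, m.Groups["Alpha"].Value, m.Groups["Numeric"].Value);
    }
}
```

With letters on both sides, as in "NW100W", the leading letters fill "Alpha" first, so the conditional takes its empty branch and the trailing "W" is left out of the match. Wrapping the pattern in ^...$ would be one way to reject such input instead.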

Upvotes: 1
