Reputation: 3371
I'm parsing a file where I want to extract a certain string.
The string will be preceded by some amount of white space, followed by either:
or
followed by a carriage return and newline.
Is it possible for me to make an expression that is equivalent to "if the character is H, then skip 8 characters, else if the character is a G then skip 9 characters" or even more simply "if the character is an H, skip 8 characters, else skip 9 characters".
The current regex I have that works well with H is @"\s+H.{8}(?<user>.*)\r\n", but I'm stumped when it comes to adding conditional character counts. For instance, it'd be really nice if there were some syntax like [H|G].{8|9}, but I don't think this actually exists in regex syntax.
Upvotes: 2
Views: 159
Reputation: 31656
This uses two nested conditional (if) constructs. Use the regex option IgnorePatternWhitespace so the pattern can span several lines and carry comments.
(?(H0[xX][0-9a-fA-F]{6}[^\r\n\d]+) # If an H followed by 0x and 6 hex digits (8 characters) is found
H.{8}(?<User>[^\r\n\d]+) # Then match the H user
| # else
(?(G0[xX][0-9a-fA-F]{7}[^\r\n\d]+) # If a G followed by 0x and 7 hex digits (9 characters) is found
G.{9}(?<User>[^\r\n\d]+)) # Then match the G user
)
The Achilles' heel is that it is unclear what a username consists of. If the username contains a digit, say 1OmegaMan, this will fail. But the OP has not specified that rule, nor given any clear examples.
So the assumption here is that a username is all alphabetic characters.
A better pattern to search for might be H\d{8}[A-Z][^\r\n]+, which says that at least one alphabetic character (uppercase, unless the IgnoreCase option is set) follows the digits and so delineates the username from them.
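As a rough sketch of how the commented pattern above could be used from C#, the snippet below passes it to Regex with RegexOptions.IgnorePatternWhitespace. The helper name FindUser and the two sample lines are assumptions made for illustration (the question does not show real input), and top-level statements need C# 9 or later. One thing worth noting: at positions where neither condition holds, the inner conditional has no else branch and matches the empty string, so the sketch only accepts a match when the User group actually captured something.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern taken from the answer above; the text after each # is a regex comment.
var pattern = new Regex(@"
    (?(H0[xX][0-9a-fA-F]{6}[^\r\n\d]+)      # if H, 0x, 6 hex digits and non-digits lie ahead
        H.{8}(?<User>[^\r\n\d]+)            # then skip H plus 8 characters and capture the user
    |                                       # else
        (?(G0[xX][0-9a-fA-F]{7}[^\r\n\d]+)  # if G, 0x, 7 hex digits and non-digits lie ahead
            G.{9}(?<User>[^\r\n\d]+))       # then skip G plus 9 characters and capture the user
    )", RegexOptions.IgnorePatternWhitespace);

// Hypothetical helper: returns the captured user, or null when nothing matches.
string FindUser(string text)
{
    foreach (Match m in pattern.Matches(text))
    {
        // The pattern can succeed with an empty match where neither condition
        // holds, so only accept matches in which the User group captured.
        if (m.Groups["User"].Success)
        {
            return m.Groups["User"].Value;
        }
    }
    return null;
}

// Assumed sample lines: leading white space, prefix, user name, CR LF.
Console.WriteLine(FindUser("   H0x1A2B3Calice\r\n"));
Console.WriteLine(FindUser("   G0x1A2B3C4bob\r\n"));
```

Scanning all matches rather than taking the first one is a design choice forced by the empty matches; anchoring the pattern with a leading \s+ would be another way to avoid them.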
Upvotes: 0
Reputation: 2802
As per my comment, I just elaborated on yours to get
\s+((H.{8})|(G.{9}))(?<user>.*)\r\n
Since regular expressions correspond to finite state automata, it is easy to see why this is trivial: on reading an H we go into one state, and on reading a G into the other.
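A minimal sketch of the alternation in use, assuming input lines shaped like the ones described in the question. The helper name ExtractUser and the sample lines are made up for illustration, and the prefix is wrapped in a non-capturing group (?:...), which should behave the same as the capturing groups above while leaving user as the only capture. Top-level statements need C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Same alternation as above: H plus 8 characters, or G plus 9 characters.
var lineRegex = new Regex(@"\s+(?:H.{8}|G.{9})(?<user>.*)\r\n");

// Hypothetical helper: returns the user part of a line, or null if it does not match.
string ExtractUser(string line)
{
    Match m = lineRegex.Match(line);
    return m.Success ? m.Groups["user"].Value : null;
}

// Assumed sample lines: leading white space, prefix, user name, CR LF.
Console.WriteLine(ExtractUser("   H12345678alice\r\n"));
Console.WriteLine(ExtractUser("   G123456789bob\r\n"));
```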
Upvotes: 4
Reputation: 73482
Well, it is possible with Regex. You can use conditions in regex.
Here is the main part of the regex you're struggling with. I assume you can build on this.
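using System;
using System.Text.RegularExpressions;

var subject = "H12345678ABC";
// If an H is found at the current position, match it plus 8 characters; otherwise match 10 characters.
var regex = new Regex(@"(?((?<hgroup>H))\k<hgroup>.{8}|.{10})(?<user>.*)");
var match = regex.Match(subject);
if (match.Success)
{
    Console.WriteLine(match.Groups["user"].Value); // prints ABC
}
else
{
    Console.WriteLine("No Match");
}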
Breakdown:
(?<hgroup>H) The condition: tests for an H at the current position and stores it in group hgroup
\k<hgroup>.{8} If the condition holds, matches H followed by any 8 characters
.{10} If not, matches the next 10 characters (G followed by 9 other characters)
(?<user>.*) Captures everything that remains into the user group
Here is a working Demo
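For completeness, here is one way the conditional might be folded back into the full pattern from the question. This is a sketch rather than the answer's exact regex: it writes the condition as an explicit lookahead, (?(?=H)...), and requires a literal G in the else branch instead of any 10 characters. The helper name ExtractUserConditional and the sample lines are assumptions; top-level statements need C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// (?(?=H) yes | no): if an H lies ahead, match H plus 8 characters,
// otherwise require G plus 9 characters.
var conditional = new Regex(@"\s+(?(?=H)H.{8}|G.{9})(?<user>.*)\r\n");

// Hypothetical helper: returns the user part of a line, or null if it does not match.
string ExtractUserConditional(string line)
{
    Match m = conditional.Match(line);
    return m.Success ? m.Groups["user"].Value : null;
}

// Assumed sample lines: leading white space, prefix, user name, CR LF.
Console.WriteLine(ExtractUserConditional("  H12345678ABC\r\n"));
Console.WriteLine(ExtractUserConditional("  G123456789XYZ\r\n"));
```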
Upvotes: 1