Mick

Reputation: 727

Need C# Regex to match a four-digit sequence, but ignore any single digit preceding it

OK, I need to improve this question. Let me try this again:

I need to parse out a flight time which comes after an airport code, but may have a single digit and white space between the two.

Example data:

ORD  1100
HOU 1 1215
MAD   4  1300

I tried this:

([A-Z]{3})\s?\d?\s?(\d{4})

I end up with the airport code and a single digit. I need a regex that will ignore everything after the airport code except the 4 digit flight time.

Hope I improved my question.

Upvotes: 2

Views: 2814

Answers (3)

LTAcosta

Reputation: 590

This is the answer I would use:

@"([A-Z]{3})\s+(?:[0-9]\s+)?([0-9]{4})"

Basically it is very similar to what you were attempting to do.

The first part is ([A-Z]{3}), which looks for 3 uppercase letters and assigns them to group 1 (group 0 is the entire match).

The second part is \s+(?:[0-9]\s+)?, which requires at least one whitespace character, optionally followed by a single digit and more whitespace. The non-capturing group requires that if a single digit is there, it must be followed by at least one whitespace character. This prevents the 1 in something like ABC 12345 from being skipped as the stray digit, which would give 2345 as the time. Note that the pattern still matches the first four digits (1234) of such input; append (?![0-9]) if a longer digit run should be rejected outright.

Next we have ([0-9]{4}), which simply matches the 4 digits you are looking for. These can be found in group 2. I use [0-9] here because in .NET \d matches any Unicode decimal digit by default, not just 0-9 (Eastern Arabic numerals, for example).
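To make the group numbering concrete, here is a minimal sketch that runs the pattern against the sample data from the question. It assumes C# 9 or later (top-level statements); the helper name TryParseFlight is hypothetical and not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the answer: airport code, whitespace, an optional
// stray digit followed by whitespace, then the four-digit time.
var flightPattern = new Regex(@"([A-Z]{3})\s+(?:[0-9]\s+)?([0-9]{4})");

// Hypothetical helper: returns true and fills airport/time on a match.
bool TryParseFlight(string input, out string airport, out string time)
{
    Match m = flightPattern.Match(input);
    airport = m.Success ? m.Groups[1].Value : string.Empty; // group 1: code
    time = m.Success ? m.Groups[2].Value : string.Empty;    // group 2: time
    return m.Success;
}

foreach (string sample in new[] { "ORD  1100", "HOU 1 1215", "MAD   4  1300" })
{
    if (TryParseFlight(sample, out string code, out string hhmm))
    {
        Console.WriteLine($"{code} -> {hhmm}");
    }
}
// Prints:
// ORD -> 1100
// HOU -> 1215
// MAD -> 1300
```

The stray digit never shows up in the result because it sits in a non-capturing group, so only the airport code and the time are numbered.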

Upvotes: 1

Patrik Westerlund

Reputation: 464

Here's a little something, using lookbehind and lookahead to be sure there are only 4 digits, with non-digits (or beginning/end) surrounding them.

"(?<=[^\d]|^)\d{4}(?=[^\d]|$)"

The two [^\d] can be replaced with [\s] to match only four-digit sequences that have whitespace (or the beginning/end of the input) around them.

Update: With your latest update, I merged my regex with yours (from the comment) and came up with this:

"(?<=[A-Z]{3}\s(\d\s)?)\d{4}(?=\s|$)"

There are three parts to the pattern. First is the lookbehind: (?<=PatternHere). The pattern inside this must occur/match before what we seek.

The next part is our simple main pattern: \d{4}, four digits.

The last part is the lookahead: (?=PatternHere), which is pretty much the same as lookbehind, but checks the other side, forward.
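Here is a small sketch of the lookaround approach in C# (assuming C# 9 or later for top-level statements). One adjustment is an assumption on my part and not in the pattern above: \s is widened to \s+ so that the runs of several spaces in the sample data still satisfy the lookbehind. .NET allows variable-length patterns inside a lookbehind, so this is legal there.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Variant of the lookbehind pattern; \s+ instead of \s is an assumption
// made here so that multiple spaces between the fields are accepted.
var timeOnly = new Regex(@"(?<=[A-Z]{3}\s+(?:\d\s+)?)\d{4}(?=\s|$)");

string input = "ORD  1100\nHOU 1 1215\nMAD   4  1300";

// Lookarounds consume nothing, so each match value is only the time.
var times = new List<string>();
foreach (Match m in timeOnly.Matches(input))
{
    times.Add(m.Value);
}

Console.WriteLine(string.Join(", ", times)); // prints "1100, 1215, 1300"
```

With this form no capture groups are needed: the match itself is the flight time, at the cost of not getting the airport code back.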

Upvotes: 1

Michal Klouda

Reputation: 14521

The solution might be as simple as:

\d{4}

According to your inputs, you don't need to care about preceding digits.
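A quick check of that claim, as a sketch assuming C# 9 or later; the Extract helper is hypothetical. It also shows the trade-off: with no context around it, \d{4} will happily take the first four digits of a longer run.

```csharp
using System;
using System.Text.RegularExpressions;

var fourDigits = new Regex(@"\d{4}");

// Hypothetical helper: first run of four digits, or "" if there is none.
string Extract(string s) => fourDigits.Match(s).Value;

Console.WriteLine(Extract("ORD  1100"));     // 1100
Console.WriteLine(Extract("HOU 1 1215"));    // 1215
Console.WriteLine(Extract("MAD   4  1300")); // 1300
Console.WriteLine(Extract("ABC 12345"));     // 1234 (part of a longer run)
```

So this is enough when every line is known to contain exactly one four-digit time, and too loose otherwise.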

Upvotes: 2
