user2565422

Reputation: 237

Regular Expression - Range (00-16)?

I can't seem to get this regex quite right. I need to match a number range from 00yo to 16yo but exclude anything above 16.

The regex I am using at the moment is: \b[0-1]?[0-9][\s\S]?yo\b but it does not exclude matches past 16 and will match 50yo.

Please note that I am searching data on a raw hard drive, with the data only accessible as a stream. I cannot anchor the pattern with ^ or $ (the only option is to bookend the regex with a 'not' statement), so I am using \b to limit the number of false positive matches. There is more than 1 TB of data, so I am trying to keep false positives to a minimum and search speed to a maximum.

Examples of a VALID match from 0 to 16 are:

0 yo
0yo
0-yo
0_yo

00 yo
00yo
00-yo
00_yo

7 yo
7yo
7-yo
7_yo

07 yo
07yo
07-yo
07_yo

14 yo
14yo
14-yo
14_yo

Examples of NO match are anything above 16, e.g.:

20 yo
20yo
20-yo
20_yo

I am hoping to keep the joining character (e.g. - or _) as any single whitespace or non-whitespace character, so that 14>yo would also match.

Any help is much appreciated.

Upvotes: 1

Views: 153

Answers (1)

Wiktor Stribiżew

Reputation: 627102

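You need to exclude digits from matching between the number and yo. Right now, [\s\S]? matches them, so in 50yo the 5 is taken as the number and the 0 as the joining character. The number part also has to stop at 16, since [0-1]?[0-9] accepts 17 to 19 as well.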
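To see the problem concretely, here is a minimal sketch in Python (my choice of language, the question does not name a regex engine). It runs the pattern from the question against a few strings; the helper name first_match is only for illustration.

```python
import re

# Pattern copied from the question; [\s\S]? accepts any character, digits included.
ORIGINAL = re.compile(r"\b[0-1]?[0-9][\s\S]?yo\b")

def first_match(text):
    """Return the matched substring, or None when nothing matches."""
    m = ORIGINAL.search(text)
    return m.group(0) if m else None

# 50yo and "17 yo" are unwanted hits; 14yo is a wanted one.
for sample in ["50yo", "17 yo", "14yo"]:
    print(repr(sample), "->", repr(first_match(sample)))
```

All three samples match, including the two that are above 16.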

I suggest:

\b(?:1[0-6]|0?[0-9])\D?yo\b

See regex demo

Explanation:

  • \b - word boundary
  • (?:1[0-6]|0?[0-9]) - 2 alternatives:
    • 1[0-6] - 1 followed by a digit from 0 to 6
    • | - or...
    • 0?[0-9] - optional 0 followed by any digit
  • \D? - one or zero non-digit characters (note that you can restrict it further by writing it as the equivalent negated character class [^\d]? and adding more excluded characters inside the brackets)
  • yo\b - the literal yo followed by a word boundary.
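The pattern can be checked against the examples from the question with a short script. This is only a sketch in Python, which is an assumption on my part; the names AGE_YO and matches are made up for the example, and the inputs 16yo, 17yo and 50yo are extra edge cases I added.

```python
import re

# Suggested pattern: 10-16, or an optional leading 0 plus one digit,
# then at most one non-digit joining character, then "yo".
AGE_YO = re.compile(r"\b(?:1[0-6]|0?[0-9])\D?yo\b")

def matches(text):
    """True when the pattern is found anywhere in text."""
    return AGE_YO.search(text) is not None

should_match = ["0 yo", "0yo", "00-yo", "7_yo", "07yo", "14 yo", "14>yo", "16yo"]
should_not_match = ["20 yo", "20yo", "20-yo", "20_yo", "17yo", "50yo"]

for s in should_match:
    assert matches(s), s
for s in should_not_match:
    assert not matches(s), s
print("all examples behave as expected")
```

In Python, a str pattern treats \D and \b as Unicode-aware. For raw disk data a bytes pattern (the same expression written with an rb"..." prefix and applied to bytes) would be the closer fit, since there \D and \b follow ASCII rules only.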

Upvotes: 1
