okan

Reputation: 123

regex doesn't match the word if it's not the last word

I'm trying to write a regex that matches a word in a string under these conditions:

  1. the word must be 8 characters long.
  2. the word must have 1 alphabetic character at any position in the word.
  3. the word must have 7 digits at any position in the word.

\b(?=\w{8}\z)(?=[^a-zA-Z]*[a-zA-Z]{1})(?=(?:[\D]*[\d]){7}).*\b

This finds "123r1234" and "foo 123r1234", but it doesn't find "foo bar 123r1234 foo". I tried adding word boundaries, but that didn't work. What is wrong with my regex, and how can I fix it?

thanks.

Upvotes: 2

Views: 159

Answers (2)

Wiktor Stribiżew

Reputation: 626861

You can use the following regex:

\b(?=[^a-zA-Z]*[a-zA-Z])(?=(?:\D*\d){7})\w{8}\b

See demo

There are several things to note here:

  1. It is not necessary to enclose a single shorthand class (like \d) in a character class; doing so makes the pattern awkward and less readable. Thus, use \D instead of [\D].
  2. As a rule, the number of look-aheads should equal the number of conditions minus 1 (see "Fine-Tuning: Removing One Condition" at rexegg.com). Most often, length-restriction look-aheads built on a single character or character class are good candidates for moving into the base pattern. Here, the \w{8} from the length look-ahead can replace the .* at the end.
  3. The (?=\w{8}\z) look-ahead contains an end-of-string anchor, \z, which forces the match to sit at the end of the string, while you need (as I now know) the end of a word.
  4. [a-zA-Z]{1} is equal to [a-zA-Z], since {1} means exactly one repetition and is therefore redundant (again, regex patterns should be as clean and concise as they can be).
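To see points 2 and 3 in action, here is a minimal Python sketch that runs the question's pattern and the corrected one against the three sample strings. Two things here are my assumptions rather than part of the original post: the helper name find_word is purely illustrative, and \z is written as \Z, which is the end-of-string anchor in Python's re module.

```python
import re

# The question's pattern, with \z written as \Z (Python's end-of-string anchor).
ORIGINAL = re.compile(
    r"\b(?=\w{8}\Z)(?=[^a-zA-Z]*[a-zA-Z]{1})(?=(?:[\D]*[\d]){7}).*\b"
)

# The corrected pattern: the length check moves into the base pattern as \w{8}.
FIXED = re.compile(r"\b(?=[^a-zA-Z]*[a-zA-Z])(?=(?:\D*\d){7})\w{8}\b")


def find_word(pattern, text):
    """Return the first match of `pattern` in `text`, or None (illustrative helper)."""
    m = pattern.search(text)
    return m.group(0) if m else None


for sample in ("123r1234", "foo 123r1234", "foo bar 123r1234 foo"):
    # Print the sample, then what each pattern finds in it.
    print(repr(sample), find_word(ORIGINAL, sample), find_word(FIXED, sample))
```

The original pattern returns None for the third sample because no 8-character word there is followed by the end of the string. One caveat worth checking against your data: the two remaining look-aheads are not confined to the current word, so in an input such as "12345678 a" they can be satisfied by characters from later words.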

UPDATE (+1 goes to @Jonny5)

There is another way to approach the problem: require the word to consist of 8 word characters, but match only 1 letter surrounded by any number of digits. This can be achieved with

(?i)\b(?=\w{8}\b)\d*[a-z]\d*\b

See another demo (note that the i modifier is used here)
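As a rough check of this second pattern, the Python sketch below runs it over a few words. The helper name find_words and the sample strings other than "foo bar 123r1234 foo" are made up for illustration. Because the consuming part only accepts digits and a single letter, words with two letters or with no letter at all are skipped.

```python
import re

# The alternative pattern: the look-ahead fixes the length at 8 word characters,
# and the consuming part allows exactly one letter among digits.
ALT = re.compile(r"(?i)\b(?=\w{8}\b)\d*[a-z]\d*\b")


def find_words(text):
    """Return every word in `text` that the pattern accepts (illustrative helper)."""
    return ALT.findall(text)


print(find_words("foo bar 123r1234 foo"))
# Letter first, letter last, two letters, and digits only followed by a letter.
print(find_words("R1234567 1234567x 12ab3456 12345678 a"))
```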

Upvotes: 3

Marcos Pérez Gude

Reputation: 22158

You can remove the trailing .* and replace it with \w{8}, which enforces the 8-character length.

\b(?=[^a-zA-Z]*[a-zA-Z])(?=(?:[\D]*[\d]){7})\w{8}\b

You can view it running here:

https://regex101.com/r/bX6rK8/1

Upvotes: 2
