slier

Reputation: 6740

Lookahead confusion

OK, I got this example from the Regular Expressions Cookbook:

^(?=.{3}$).*

The regex above is used to limit the length of an arbitrary pattern.

If I test it against 'aaabbb', it fails completely.

From what I understand, it looks for any characters that are preceded by any 3 characters, so it should match 'bbb', but it does not.

One more question: should a lookbehind follow the pattern x(?=x)?

Upvotes: 1

Views: 1191

Answers (2)

Richard JP Le Guen

Reputation: 28753

From what I understand, it looks for any characters that are preceded by any 3 characters, so it should match 'bbb', but it does not.

Nope! Let's take a closer look...

^        # The caret is an anchor: the match must begin at the start of the string
(?=      # start of lookahead
   .     # wildcard; the . matches any non-newline character
    {3}  # quantifier; exactly 3 times
   $     # dollar sign; an anchor meaning "the end of the string"
)        # end of lookahead
.        # wildcard; the . matches any non-newline character
 *       # quantifier; any number of times, including 0 times

Several problems:

  1. The caret anchors the match at the start of the string, and the lookahead is checked from that position: it requires exactly three characters followed by the end of the string. 'aaabbb' has six characters, so the assertion fails and nothing matches. The pattern never gets to look at 'bbb' on its own.
  2. Your .{3} actually means any three characters, not any one character repeated three times ;) For that, you actually want "How can I find repeated letters with a Perl regex?"
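
To check both points concretely, here is a minimal sketch using Python's re module. Python, the helper matched_text, and the backreference pattern (.)\1{2} are my own choices for illustration and do not come from the question. It shows the original pattern rejecting 'aaabbb', accepting a three-character string, and a separate pattern that finds one character repeated three times.

```python
import re

# Pattern from the question: the whole string must be exactly 3 characters.
LENGTH_3 = re.compile(r'^(?=.{3}$).*')

# Illustrative alternative: one character repeated three times in a row,
# written with a backreference to the captured character.
TRIPLE = re.compile(r'(.)\1{2}')

def matched_text(pattern, text):
    """Return the matched substring, or None when there is no match."""
    m = pattern.search(text)
    return m.group(0) if m else None

print(matched_text(LENGTH_3, 'aaabbb'))  # None: six characters, not three
print(matched_text(LENGTH_3, 'abc'))     # abc
print(matched_text(TRIPLE, 'aaabbb'))    # aaa
print(matched_text(TRIPLE, 'abcabc'))    # None: no character repeats 3 times
```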

Upvotes: 4

Peter O'Callaghan

Reputation: 6186

That is actually a lookahead assertion, not a lookbehind assertion. The ^ anchors the match at the start of the string; the lookahead then asserts that the start of the string must be followed by exactly 3 characters and then the end of the string.
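
For contrast, here is a small sketch of the two assertion types side by side, using Python's re module (the sample string 'xy' and the variable names are assumptions for illustration). A lookahead is written (?=...) and checks what follows the current position; a lookbehind is written (?<=...) and checks what precedes it.

```python
import re

text = 'xy'

# Lookahead: match 'x' only when 'y' comes right after it.
ahead = re.search(r'x(?=y)', text)

# Lookbehind: match 'y' only when 'x' comes right before it.
behind = re.search(r'(?<=x)y', text)

print(ahead.group(0), ahead.span())    # x (0, 1)
print(behind.group(0), behind.span())  # y (1, 2)
```

In both cases the asserted character is not part of the match, which is why each span covers a single character.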

Edit: I should probably have mentioned that the .* at the end is then used to match those three characters, since a lookahead assertion doesn't consume any characters.
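
A minimal sketch of that zero-width behaviour, assuming Python's re module (the variable names and the [a-z]+ variant are hypothetical, added only to illustrate the idea): the lookahead alone produces an empty match, and whatever pattern follows it does the actual consuming.

```python
import re

# Lookahead alone: asserts "exactly 3 characters, then the end" but consumes nothing.
only_assert = re.match(r'^(?=.{3}$)', 'abc')

# Lookahead plus .*: the assertion passes, then .* consumes the characters.
with_body = re.match(r'^(?=.{3}$).*', 'abc')

print(repr(only_assert.group(0)), only_assert.span())  # '' (0, 0)
print(repr(with_body.group(0)), with_body.span())      # 'abc' (0, 3)

# The trailing pattern can be more specific than .*; here the lookahead
# caps the length at 3 while [a-z]+$ restricts which characters are allowed.
limited = re.compile(r'^(?=.{1,3}$)[a-z]+$')
print(bool(limited.match('ab')))    # True
print(bool(limited.match('abcd')))  # False: longer than 3 characters
```

This is the sense in which the lookahead "limits the length of an arbitrary pattern": the length check and the content check are evaluated over the same characters.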

Upvotes: 6
