Belmark Caday

Reputation: 1683

Lookaround assertions in Perl

I'm confused: what is the use of these lookaround assertions in Perl?

For example, this one:

(?=pattern)

that is, the positive lookahead. So here are my questions:

  1. How are these useful? In what sort of situations are they used?
  2. Related to question 1: why would I want to look ahead of the regex pattern? Isn't it more work to look ahead and then run the pattern matching again?

I need a very clear example if possible. Thanks

Upvotes: 4

Views: 939

Answers (5)

ikegami

Reputation: 386406

To uppercase what's in between commas, you could use:

(my $x = 'a,b,c,d,e') =~ s/(?<=,)([^,]*)(?=,)/ uc($1) /eg;   # a,B,C,D,e

                  a,b,c,d,e
Pass 1 matches      -
Pass 2 matches        -
Pass 3 matches          -

If you didn't use lookarounds, this is what you'd get,

(my $x = 'a,b,c,d,e') =~ s/,([^,]*),/ ','.uc($1).',' /eg;   # a,B,c,D,e

                  a,b,c,d,e
Pass 1 matches     ---
Pass 2 matches         ---

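Not only do the lookarounds avoid repeating the commas in the replacement, the substitution doesn't work without them! A comma consumed at the end of one match can't also serve as the leading comma of the next match.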


Another somewhat common use is as part of a multi-character (string) equivalent of [^CHAR].

foo(?:(?!foo|bar).)*bar  # foo..bar, with no nested foo or bar
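As a rough illustration of the pattern above, the following sketch compares it with a plain non-greedy match. The input string and variable names are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical input: a second "foo" appears before the closing "bar".
my $text = 'foo 1 foo 2 bar';

# A plain non-greedy match starts at the first "foo" and runs over the nested one.
my ($loose) = $text =~ /(foo.*?bar)/s;

# Here each character is consumed only if neither "foo" nor "bar" starts at it.
my ($strict) = $text =~ /(foo(?:(?!foo|bar).)*bar)/s;

print "$loose\n";   # foo 1 foo 2 bar
print "$strict\n";  # foo 2 bar
```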

You can use it to narrow down character classes.

\w(?<!\d)     # A word char that's not a digit.

Although this can now be done using (?[ ... ]).
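A small sketch of the narrowing idea, using a made-up input string. It also shows the same set written as an ordinary negated character class, which should behave identically here.

```perl
use strict;
use warnings;

# Hypothetical input mixing letters, digits and an underscore.
my $s = 'a1_b2';

# \w consumes one word character; (?<!\d) then rejects it if it was a digit.
my @lookbehind = $s =~ /(\w(?<!\d))/g;

# The same set as a double-negated class: not (non-word or digit).
my @class = $s =~ /([^\W\d])/g;

print join('', @lookbehind), "\n";  # a_b
print join('', @class), "\n";       # a_b
```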


It's also useful in more esoteric patterns.

 /a/ && /b/ && /c/

can be written as

 /^(?=.*?a)(?=.*?b).*?c/s
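To check that the two forms agree, here is a minimal sketch that runs both against a few made-up strings. The function names and samples are illustrative only.

```perl
use strict;
use warnings;

# Three separate tests, one per required letter.
sub has_all_separate {
    my ($s) = @_;
    return ($s =~ /a/ && $s =~ /b/ && $s =~ /c/) ? 1 : 0;
}

# One pattern: each lookahead scans from the start without consuming anything.
sub has_all_combined {
    my ($s) = @_;
    return ($s =~ /^(?=.*?a)(?=.*?b).*?c/s) ? 1 : 0;
}

# Hypothetical samples, including one with embedded newlines (hence /s).
my @samples = ('cab', 'abc', "c\nb\na", 'ab', 'xyz');
for my $s (@samples) {
    (my $shown = $s) =~ s/\n/\\n/g;   # make newlines visible when printing
    printf "%-8s separate=%d combined=%d\n",
        $shown, has_all_separate($s), has_all_combined($s);
}
```

The letters can appear in any order because every lookahead is anchored at the same starting position.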

Upvotes: 4

doubleDown

Reputation: 8408

Lookaround assertions are useful when you need a pattern to help locate the match but you don't want that pattern to be part of what is matched (and therefore replaced).

Here's a simple scenario with lookahead assertion:

Let's say I have

my $text = '98 degrees, 99 Red Balloons, 101 Dalmatians';

and I want to change the number of red balloons from its previous value to 9001, so I use

$text =~ s/\d+(?= Red Balloons)/9001/;   # note the space: the text reads "99 Red Balloons"
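A runnable sketch of this substitution, next to a capture-group version for comparison. Note that the lookahead has to include the space that separates the number from "Red Balloons" in the text; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $text = '98 degrees, 99 Red Balloons, 101 Dalmatians';

# Lookahead: only the digits are matched, so only the digits are replaced.
(my $with_lookahead = $text) =~ s/\d+(?= Red Balloons)/9001/;

# Without a lookahead the trailing words are consumed and must be put back.
(my $with_capture = $text) =~ s/\d+( Red Balloons)/9001$1/;

print "$with_lookahead\n";  # 98 degrees, 9001 Red Balloons, 101 Dalmatians
print "$with_capture\n";    # 98 degrees, 9001 Red Balloons, 101 Dalmatians
```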

Upvotes: 0

amon

Reputation: 57640

There are many reasons to use lookarounds, e.g.

  1. limiting the substring that is considered to be matched: s/(?<=[0-9])\+(?=[0-9])/-/ instead of s/([0-9])\+([0-9])/$1-$2/.
  2. and-ing various conditions together: /(?=\p{Uppercase}\p{Lowercase})\p{InBasicLatin}{2,}/.
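For the first item, here is a sketch that assumes the intent is to replace a literal + between two digits with a minus sign. With /g added, it also shows a difference between the two forms: the capturing version consumes the digits, so neighbouring matches cannot share one.

```perl
use strict;
use warnings;

# Hypothetical input with two plus signs that share the middle digit.
my $expr = '1+2+3';

# Lookarounds: only the "+" is matched, so the digit 2 can border both matches.
(my $lookaround = $expr) =~ s/(?<=[0-9])\+(?=[0-9])/-/g;

# Captures: "1+2" is consumed whole, leaving "+3" with no digit in front of it.
(my $captured = $expr) =~ s/([0-9])\+([0-9])/$1-$2/g;

print "$lookaround\n";  # 1-2-3
print "$captured\n";    # 1-2+3
```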

Upvotes: 0

Andrew Cheong

Reputation: 30283

I have found lookaheads especially useful for checking multiple conditions. For example, consider a regex that checks that a password has at least one lowercase, one uppercase, one numeric, and one symbol character, and is at least 8 characters in length:

^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9]).{8,}$

Try to devise a regex to do the same thing without lookahead assertions! It's possible, but it's extremely cumbersome.
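As a quick way to try the password regex, here is a minimal sketch. The helper name is_strong and the sample passwords are made up for the example.

```perl
use strict;
use warnings;

# Each lookahead checks one requirement from the start of the string;
# only the final .{8,} actually consumes characters.
my $strong = qr/^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9]).{8,}$/;

# Hypothetical helper: returns 1 if all conditions hold, 0 otherwise.
sub is_strong {
    my ($password) = @_;
    return $password =~ $strong ? 1 : 0;
}

for my $pw ('Passw0rd!', 'password', 'Pw0!', 'Passw0rdX') {
    printf "%-10s %s\n", $pw, is_strong($pw) ? 'ok' : 'rejected';
}
```

Only the first sample satisfies every condition; the others lack a character class, the minimum length, or a symbol.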

Meanwhile, I've found lookbehinds especially useful for checking boundary conditions: for example, matching a run of 0's unless it's preceded by another digit, as in 1000067.
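One possible reading of that boundary check, sketched with a negative lookbehind. The pattern and the sample text are assumptions made for illustration, not taken from the answer.

```perl
use strict;
use warnings;

# Hypothetical input: leading zeros in one number, embedded zeros in another.
my $text = 'id 000123 and 1000067';

# (?<!\d) requires that no digit sits immediately before the run of zeros.
my @runs = $text =~ /((?<!\d)0+)/g;

# Without the lookbehind, the zeros inside 1000067 are picked up as well.
my @naive = $text =~ /(0+)/g;

print "@runs\n";   # 000
print "@naive\n";  # 000 0000
```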

These are my experiences but certainly there are many more practical uses and the way everyone uses a tool can vary from person to person.

Upvotes: 1

Anirudha

Reputation: 32807

A lookahead lets you check for a pattern without making it part of the match.

When you use a(?=b), you match a only if it's followed by b. Note: the b itself is not part of the match.
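A short sketch that makes this visible by inspecting the match offsets; the variable names are illustrative.

```perl
use strict;
use warnings;

my $s = 'ab';
my ($matched, $end);

if ($s =~ /a(?=b)/) {
    # @- and @+ hold the start and end offsets of the whole match.
    $end     = $+[0];
    $matched = substr($s, $-[0], $end - $-[0]);
}

# The match is just "a" and stops at offset 1, before the "b".
print "matched '$matched', match ends at offset $end\n";

# With no "b" after the "a", the assertion fails and nothing matches.
my $without_b = ('ac' =~ /a(?=b)/) ? 1 : 0;
print "matches 'ac': $without_b\n";
```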


So,

1> You can extract hello (without the # characters) from #hello# using

(?<=#)hello(?=#)

2> You can validate passwords against requirements such as "must contain at least 2 digits and at least 2 lowercase letters, along with any other characters"

^(?=(.*\d){2})(?=(.*[a-z]){2}).*$

Try doing the above without lookaheads and you will realize their importance.
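For reference, a minimal sketch that exercises the second pattern. The helper name has_two_each and the sample strings are made up for the example.

```perl
use strict;
use warnings;

# (.*\d){2} needs two digits somewhere; (.*[a-z]){2} needs two lowercase letters.
my $valid = qr/^(?=(.*\d){2})(?=(.*[a-z]){2}).*$/;

# Hypothetical helper: returns 1 if both counts are met, 0 otherwise.
sub has_two_each {
    my ($password) = @_;
    return $password =~ $valid ? 1 : 0;
}

for my $pw ('a1b2', '1a!2b', 'ab1', '12a') {
    printf "%-6s %s\n", $pw, has_two_each($pw) ? 'ok' : 'rejected';
}
```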

Upvotes: 3
