luciano

Reputation: 13850

Regular expression with positive lookahead/lookbehind

How can I match all of the strings below with a single regular expression?

This is the regular expression I've tried: (?<=.+)site(?=.+)

Note that simpler regular expressions might do the job, but the whole point of this is to learn what the (?<=.+) and (?=.+) parts of the regular expression do.

locationAsite1
locationAsiteNumber1
locationAsiteNumber01
locationAsite01
locationBsite.01
locationB.site.02
(locationB)site.02
<locationB>site<03>s
..locationB..site<03>

Upvotes: 2

Views: 496

Answers (2)

Rakholiya Jenish

Reputation: 3223

Positive lookbehind means that the expression inside (?<= ) must match immediately before the current position.

For example, (?<=A)site matches only those occurrences of site that have an A directly before them. The A itself is not part of the match; the lookbehind only ensures that A comes before site.
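As a minimal sketch of the lookbehind described above, assuming Python's built-in `re` module (the sample string is made up from two of the question's inputs):

```python
import re

# Two of the question's inputs, joined by a space for illustration.
text = "locationAsite1 locationBsite.01"

# (?<=A) asserts that "A" sits directly before "site" without consuming it.
matches = list(re.finditer(r"(?<=A)site", text))
matched_text = [m.group() for m in matches]
spans = [m.span() for m in matches]

print(matched_text)  # only the text "site", never the "A"
print(spans)         # the span starts after the "A"
```

The span starts at the `s` of `site`, which shows that the `A` is checked but not included in the match.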


Positive lookahead works the same way as positive lookbehind, except that the expression must match immediately after the match.

Example: site(?=1) matches only those occurrences of site that are immediately followed by 1. As with positive lookbehind, the 1 is not part of the match; the lookahead only ensures that the matched site is part of site1.
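The lookahead case can be sketched the same way, again assuming Python's `re` module and using three of the question's inputs:

```python
import re

samples = ["locationAsite1", "locationAsite01", "locationAsiteNumber1"]

# (?=1) asserts that "1" directly follows "site" without consuming it.
hits = {s: [m.group() for m in re.finditer(r"site(?=1)", s)] for s in samples}

for sample, found in hits.items():
    print(sample, found)
```

Only the first sample has `1` directly after `site`; in the other two the next character is `0` or `N`, so the lookahead fails there.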


Your case of (?<=.)site(?=.) is not a good example, since it matches every site in your input.

So (?<=A)site(?=0) matches only the site in the line locationAsite01, since that is the only site with A directly before it and 0 directly after it.
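To compare the two patterns against the full input from the question, here is a small sketch assuming Python's `re` module (the variable names `loose` and `strict` are only illustrative):

```python
import re

# The nine input strings from the question.
lines = [
    "locationAsite1",
    "locationAsiteNumber1",
    "locationAsiteNumber01",
    "locationAsite01",
    "locationBsite.01",
    "locationB.site.02",
    "(locationB)site.02",
    "<locationB>site<03>s",
    "..locationB..site<03>",
]

loose = re.compile(r"(?<=.)site(?=.)")   # any character before and after
strict = re.compile(r"(?<=A)site(?=0)")  # "A" before and "0" after

loose_hits = [s for s in lines if loose.search(s)]
strict_hits = [s for s in lines if strict.search(s)]

print(len(loose_hits), "of", len(lines), "lines match the loose pattern")
print(strict_hits)
```

The loose pattern accepts every line, while the strict one keeps only `locationAsite01`.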

Upvotes: 3

Avinash Raj

Reputation: 174834

Your regex may also be written as

(?<=.)site(?=.)

which means the string site must be preceded and followed by at least one character.

Most regex engines don't support variable-length lookbehind; .NET (C#) is a notable exception.
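Python's built-in `re` module is one engine with this restriction. A quick check, offered as a sketch rather than a statement about every engine:

```python
import re

try:
    # .+ inside a lookbehind has no fixed width, so re refuses to compile it.
    re.compile(r"(?<=.+)site(?=.+)")
    supported = True
except re.error as exc:
    supported = False
    print("rejected:", exc)

# The fixed-width form compiles without complaint.
fixed = re.compile(r"(?<=.)site(?=.)")
print(supported, fixed.pattern)
```

Note that only the lookbehind is restricted here; a variable-length lookahead such as `(?=.+)` is accepted by `re`.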

(?<=.+)site(?=.+)

means the substring site must be preceded and followed by one or more characters. That is, it matches site only when it is in the middle of the string, not when it appears at the very start or the very end.
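The start and end cases can be checked with a short sketch. It assumes Python's `re` module and uses the fixed-width form `(?<=.)site(?=.)`, which accepts the same strings as the `.+` version because both only require at least one character on each side. The first two test strings are made up for illustration.

```python
import re

pattern = re.compile(r"(?<=.)site(?=.)")

cases = [
    "site01",           # site at the very start: nothing before it
    "locationAsite",    # site at the very end: nothing after it
    "locationAsite01",  # site in the middle
]
results = {s: bool(pattern.search(s)) for s in cases}

for case, matched in results.items():
    print(case, matched)
```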

Upvotes: 5
