luca.p.alexandru

Reputation: 1750

Regex negative lookahead not working as expected

I have the following regex:

[a-zA-Z0-9. ]*(?!cs)

and the string

Hotfix H5.12.1.00.cs02_ADV_LCR

I want to match only up to

Hotfix H5.12.1.00

But the regex matches up to and including "cs02".

Shouldn't the negative lookahead have done the job?

Upvotes: 1

Views: 171

Answers (2)

Wiktor Stribiżew

Reputation: 626690

You may consider using a tempered greedy token:

(?:(?!\.cs)[a-zA-Z0-9. ])*

See the regex demo.

This works whether or not .cs is present in the string, because the tempered greedy token matches zero or more characters from the [a-zA-Z0-9. ] character class, checking before each one that it does not start a .cs sequence.
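As a rough illustration of the tempered greedy token, here is a minimal Python sketch. The helper name prefix_before_cs is hypothetical, and the use of Python's re module is an assumption, since the question does not say which regex engine is in use.

```python
import re

# Tempered greedy token from the answer: consume one allowed character at a
# time, but only while the text at the current position does not start ".cs".
TEMPERED = re.compile(r"(?:(?!\.cs)[a-zA-Z0-9. ])*")

def prefix_before_cs(text):
    # Hypothetical helper; match() anchors the pattern at the start of text.
    return TEMPERED.match(text).group(0)

with_cs = prefix_before_cs("Hotfix H5.12.1.00.cs02_ADV_LCR")
without_cs = prefix_before_cs("Hotfix H5.12.1.00")

print(with_cs)     # Hotfix H5.12.1.00
print(without_cs)  # Hotfix H5.12.1.00
```

Both inputs give the same prefix, which is the "works whether or not .cs is present" property described above.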

Upvotes: 1

Avinash Raj

Reputation: 174696

You need to use positive lookahead instead of negative lookahead.

[a-zA-Z0-9. ]*(?=\.cs)

or

[a-zA-Z0-9. ]+(?=\.cs)
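A small Python sketch of the positive-lookahead variant follows. Python's re module is an assumption here, as the question does not name the regex engine. It also shows that, unlike a pattern without the lookahead, this one finds nothing when .cs is absent.

```python
import re

pattern = re.compile(r"[a-zA-Z0-9. ]+(?=\.cs)")

# The greedy class first runs up to the underscore, then backtracks until
# the position it stops at is immediately followed by ".cs".
m = pattern.search("Hotfix H5.12.1.00.cs02_ADV_LCR")
matched = m.group(0) if m else None

# With no ".cs" in the input, the lookahead can never succeed.
no_match = pattern.search("Hotfix H5.12.1.00")

print(matched)   # Hotfix H5.12.1.00
print(no_match)  # None
```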

Note that your regex [a-zA-Z0-9. ]*(?!cs) is greedy: it consumes as many characters as it can and only requires that the position where it stops is not followed by cs. See here.

First, the pattern [a-zA-Z0-9. ]+ greedily matches Hotfix H5.12.1.00.cs02, because the character class accepts letters, digits, dots and spaces. It stops at the underscore character, where both of these conditions are satisfied:

  1. _ is not matched by [a-zA-Z0-9. ]+
  2. the text starting at _ is not cs, so (?!cs) succeeds

The same reasoning applies to the two further matches.
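The behaviour described above can be reproduced with a short Python sketch. Python's re module is an assumption, since the question does not name the engine. The + variant is used for findall so that no empty matches clutter the result.

```python
import re

text = "Hotfix H5.12.1.00.cs02_ADV_LCR"

# The original pattern: the lookahead is only tested where the greedy class
# stops, i.e. at "_", and "_ADV..." does not start with "cs", so it succeeds.
original = re.match(r"[a-zA-Z0-9. ]*(?!cs)", text).group(0)

# All matches of the + variant: the long first match plus two further ones.
all_matches = re.findall(r"[a-zA-Z0-9. ]+(?!cs)", text)

print(original)     # Hotfix H5.12.1.00.cs02
print(all_matches)  # ['Hotfix H5.12.1.00.cs02', 'ADV', 'LCR']
```

The negative lookahead never gets the chance to reject .cs inside the run, because the character class has already consumed it before the lookahead is evaluated.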

Upvotes: 0
