DanielRH

Reputation: 75

Regex period is matching everything minus the last character

My regex code is the following:

(.(?!\[view street map\]))+

This is meant to match everything up until the [view street map].

But if I use this regex code on the following

Test of the system[view street map]

It matches the following, and cuts off the last character

Test of the syste

Anyone have any idea why this is happening?

Thanks in advance!

Upvotes: 0

Views: 487

Answers (4)

Avinash Raj

Reputation: 174706

You need to add a start anchor and put the lookahead before the dot, so that the match covers everything up to [view street map] when [view street map] is present on that line. If [view street map] is not present, the pattern matches the whole line.

^(?:(?!\[view street map\]).)+


Explanation:

  • ^ An anchor that denotes the start of a line.
  • (?:...) A non-capturing group.
  • (?:(?!\[view street map\]).) Before matching a character, the regex engine checks whether the text starting at the current position is [view street map]. If it is, the group fails at that position and the match ends there. If it is not, the dot consumes one character. The + after the non-capturing group repeats this check-then-consume step one or more times, so the check is made for every character from the start, not only for the first one.
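As a rough illustration of the anchored pattern above, here is a small Python sketch. Python and the helper name `text_before_marker` are assumptions made for this example; the question does not say which regex engine is in use.

```python
import re

# Pattern from this answer: look for the marker BEFORE consuming each character.
PATTERN = re.compile(r"^(?:(?!\[view street map\]).)+")

def text_before_marker(line):
    """Hypothetical helper: return the text before '[view street map]', or None if nothing matches."""
    m = PATTERN.match(line)
    return m.group(0) if m else None

print(text_before_marker("Test of the system[view street map]"))  # Test of the system
print(text_before_marker("No marker on this line"))               # No marker on this line
```

Note that + requires at least one character, so a line that begins with the marker yields no match at all.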

Upvotes: 2

vks

Reputation: 67968

(.(?!\[view street map\]))+ essentially says: match a character, then check that [view street map] is not directly ahead of it. The m had [view street map] ahead of it, so it failed. All the other characters passed.
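The behaviour can be reproduced with a short Python sketch (Python is an assumption here; the variable names are made up for the example):

```python
import re

text = "Test of the system[view street map]"
original = re.compile(r"(.(?!\[view street map\]))+")

# First match: stops before "m", because "m" is the one character
# that is directly followed by the marker.
first = original.search(text).group(0)
print(first)  # Test of the syste

# All matches: after failing on "m", the engine retries further along
# and then matches the marker itself, since no character inside the
# marker is followed by the full marker.
all_matches = [m.group(0) for m in original.finditer(text)]
print(all_matches)  # ['Test of the syste', '[view street map]']
```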

You can try:

\[view street map\]|(.)

Grab the captures and concatenate them to get the string.

Or try

\[view street map\]

Replace each match with the empty string.
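Both suggestions might look like this in Python (a sketch only; Python and the names `MARKER`, `joined` and `stripped` are assumptions for the example):

```python
import re

text = "Test of the system[view street map]"
MARKER = r"\[view street map\]"

# Option 1: match either the marker or a single captured character,
# then join the captures. Group 1 is None when the marker matched.
captures = [
    m.group(1)
    for m in re.finditer(MARKER + r"|(.)", text)
    if m.group(1) is not None
]
joined = "".join(captures)
print(joined)  # Test of the system

# Option 2: replace the marker with the empty string.
stripped = re.sub(MARKER, "", text)
print(stripped)  # Test of the system
```

Unlike the lookahead patterns, both options also keep any text that comes after the marker.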

Upvotes: 0

nhahtdh

Reputation: 56809

You should do the check before consuming the character:

^((?!\[view street map\]).)+

Consuming first and then checking for the disallowed string also erroneously matches the string "[view street map]" itself.

I forgot about anchoring in the previous revision. We need to make sure the match starts at the beginning of the string; otherwise the check can be bypassed when the engine retries the match at the next index.
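To see why the anchor matters, here is a small Python comparison (Python is an assumption; the variable names are made up for the example):

```python
import re

text = "Test of the system[view street map]"

unanchored = re.compile(r"((?!\[view street map\]).)+")
anchored = re.compile(r"^((?!\[view street map\]).)+")

# Without the anchor, the engine fails at "[" but retries at the next
# index, "v", where the lookahead no longer sees the full marker.
unanchored_matches = [m.group(0) for m in unanchored.finditer(text)]
print(unanchored_matches)  # ['Test of the system', 'view street map]']

# With the anchor, a match may only start at the beginning of the string.
anchored_matches = [m.group(0) for m in anchored.finditer(text)]
print(anchored_matches)  # ['Test of the system']
```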

Upvotes: 2

Ulugbek Umirov

Reputation: 12797

Your current logic takes characters one by one, each of which must not be followed by [view street map]. You can change the logic to take a batch of characters that is followed by [view street map] or by the end of the string.

.*?(?=(?:\[view street map\])|$)
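A minimal Python sketch of this lazy-match approach (Python and the helper name `prefix_before_marker` are assumptions for the example):

```python
import re

# Lazily take characters until the marker or the end of the string is next.
LAZY = re.compile(r".*?(?=(?:\[view street map\])|$)")

def prefix_before_marker(line):
    """Hypothetical helper: text before the marker, or the whole line if there is none."""
    m = LAZY.match(line)
    return m.group(0) if m else None

print(prefix_before_marker("Test of the system[view street map]"))  # Test of the system
print(prefix_before_marker("No marker on this line"))               # No marker on this line
```

Because .*? can match zero characters, a line that starts with the marker gives an empty match rather than no match.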


Upvotes: 0
