Sarp Kaya

Reputation: 3794

Regex to ignore precise match

I want to match everything that starts with & and ends with ; but ignore it if and only if the string is exactly &lt;

So I tried &(.+?)[^lt]; but the problem is that it also ignores &foolt;. What am I doing wrong here?

Here is my test case:

&asdas;
&asdasdqwe;
&ltasd;
&lt;
&asdasdlt;
&foolt;

I expect nothing but the 4th one to be ignored. I test it using http://www.regexr.com/

Upvotes: 0

Views: 114

Answers (2)

Wiktor Stribiżew

Reputation: 627380

Your regex uses a negated character class [^lt], which means "exactly one character that is neither l nor t".

Here is the regex that will match what you need:

(?!^&lt;$)&(\S+);


(?!^&lt;$) makes sure the whole string (or the whole line, in multiline mode) is not &lt;
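As a rough check, here is a minimal Python sketch that runs this pattern over the test case from the question. The names WHOLE_LINE, text and whole_line_matches are just illustrative, and the sketch assumes multiline mode (re.MULTILINE), so that ^ and $ anchor at line boundaries the way they would in an online tester with that flag enabled.

```python
import re

# Pattern from this answer; re.MULTILINE is an assumption so that
# ^ and $ match at the start and end of each line.
WHOLE_LINE = re.compile(r"(?!^&lt;$)&(\S+);", re.MULTILINE)

# The six test lines from the question, one per line.
text = "\n".join([
    "&asdas;",
    "&asdasdqwe;",
    "&ltasd;",
    "&lt;",
    "&asdasdlt;",
    "&foolt;",
])

# Collect the full text of every match.
whole_line_matches = [m.group(0) for m in WHOLE_LINE.finditer(text)]
print(whole_line_matches)
```

Note that the lookahead only rejects &lt; when it is the entire line; an &lt; embedded in a longer line would still match.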

Upvotes: 2

femtoRgon

Reputation: 33351

To take apart your regex &(.+?)[^lt];:

  • & an ampersand
  • (.+?) then one or more characters (reluctantly).
  • [^lt] then one character, anything but 'l' or 't'.
  • ; then a semicolon
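To see that breakdown in action, here is a small Python sketch that tries the original pattern against each test string separately. The names ORIGINAL, samples and original_results are only for illustration.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"&(.+?)[^lt];")

samples = ["&asdas;", "&asdasdqwe;", "&ltasd;", "&lt;", "&asdasdlt;", "&foolt;"]

# Map each sample to whether the pattern finds a match in it.
original_results = {s: bool(ORIGINAL.search(s)) for s in samples}

for s, matched in original_results.items():
    print(f"{s!r}: {'match' if matched else 'no match'}")
```

Every string whose last character before the ; is l or t is rejected, not just &lt;, because [^lt] has to consume exactly that character.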

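That isn't quite right: [^lt] rejects any match whose last character before the ; is l or t, which is why &foolt; is skipped. Instead, you can check for lt; with a negative lookahead right after the ampersand, like:

&(?!lt;).+?;

  • & an ampersand
  • (?!lt;) a negative lookahead: make sure the text at this position is not lt;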
  • .+? then one or more characters (reluctantly).
  • ; then a semicolon
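A quick way to check the lookahead version is the Python sketch below. It assumes the same six test strings from the question, plus one made-up sentence to show the behaviour inside longer text; NOT_LT, samples, lookahead_results and inline_matches are illustrative names.

```python
import re

# Lookahead pattern from this answer.
NOT_LT = re.compile(r"&(?!lt;).+?;")

samples = ["&asdas;", "&asdasdqwe;", "&ltasd;", "&lt;", "&asdasdlt;", "&foolt;"]

# Map each sample to whether the pattern finds a match in it.
lookahead_results = {s: bool(NOT_LT.search(s)) for s in samples}
print(lookahead_results)

# Hypothetical sentence with entities embedded in other text.
inline = "before &lt; middle &foolt; and &ltasd; end"
inline_matches = NOT_LT.findall(inline)
print(inline_matches)
```

Because the check sits directly after the ampersand, &lt; is skipped wherever it appears, not only when it is the whole line.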

Upvotes: 1
