FrostbiteXIII

Reputation: 891

JavaScript Regular Expressions Lookbehind Failing

I am hoping that this will have a pretty quick and simple answer. I am using regular-expressions.info to help me get the right regular expression to turn a URL-encoded ISO-8859-1 pound sign ("%A3") into a URL-encoded UTF-8 pound sign ("%C2%A3").

In other words, I just want to swap %A3 with %C2%A3 when the %A3 is not already prefixed with %C2.

So I would have thought the following would work:

Regular Expression: (?!(\%C2))\%A3
Replace With:       %C2%A3

But it doesn't and I can't figure out why!

I assume my syntax is just slightly wrong, but I can't figure it out! Any ideas?

FYI - I know that the following will work (and have used it as a workaround in the meantime), but I really want to understand why the former doesn't work.

Regular Expression: ([^\%C2])\%A3
Replace With:       $1%C2%A3

TIA!

Upvotes: 1

Views: 787

Answers (4)

Rainer Sigwald

Reputation: 835

Why not just replace ((%C2)?%A3) with %C2%A3, making the prefix an optional part of the match? It means that you're "replacing" text with itself even when it's already right, but I don't foresee a performance issue.
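
A minimal sketch of this suggestion in JavaScript. The function name `toUtf8Pound` is made up for illustration, and the group is written as non-capturing because its contents are not reused:

```javascript
// Hypothetical helper; the pattern is the optional-prefix idea described above.
function toUtf8Pound(encoded) {
  // (?:%C2)? consumes an existing prefix when there is one, so both "%A3"
  // and "%C2%A3" are rewritten to the same "%C2%A3".
  return encoded.replace(/(?:%C2)?%A3/g, "%C2%A3");
}
```

Calling it on a string that is already correct returns the same string, so it should be safe to run more than once.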

Upvotes: 4

Jason S

Reputation: 189696

I would suggest you use the functional form of JavaScript's String.replace (see the section "Specifying a function as a parameter"). This lets you put arbitrary logic, including state if necessary, into a regexp-matching session. For your case, I'd use a simpler regexp that matches a superset of what you want; in the function you can then test whether the match meets your exact criteria, and if it doesn't, just return the matched string as is.

The only problem with this approach is that if you have overlapping potential matches, you have the possibility of missing the second match, since there's no way to return a value to tell the replace() method that it isn't really a match after all.
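
One way this could look, as a sketch rather than a definitive implementation. The function name `fixPoundWithCallback` and the choice of superset pattern (`%A3` with an optional `%C2` in front) are assumptions made for illustration:

```javascript
// Hypothetical helper showing the function form of String.prototype.replace.
function fixPoundWithCallback(encoded) {
  // Superset pattern: every "%A3", with or without a "%C2" prefix.
  return encoded.replace(/(%C2)?%A3/g, function (match, prefix) {
    // prefix is undefined when the optional group did not take part.
    // Already-prefixed text is returned unchanged; a bare "%A3" gets the prefix.
    return prefix ? match : "%C2" + match;
  });
}
```

Because the optional prefix is part of the match itself, this particular pattern does not run into the overlap problem described above.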

Upvotes: 1

Tomalak

Reputation: 338228

You could replace

(^.?.?|(?!%C2)...)%A3

with

$1%C2%A3
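
As a rough sketch, the same pattern in a JavaScript call (the function name `fixPoundWithLookahead` is made up for illustration):

```javascript
// Hypothetical helper wrapping the pattern above.
function fixPoundWithLookahead(encoded) {
  // Group 1 is either up to two characters at the very start of the string,
  // or any three characters that are not "%C2". It is put back via $1.
  return encoded.replace(/(^.?.?|(?!%C2)...)%A3/g, "$1%C2%A3");
}
```

One thing to be aware of: each match consumes the three characters before `%A3`, so with the `g` flag a second `%A3` that immediately follows a replaced one appears to be skipped (`"%A3%A3"` comes back as `"%C2%A3%A3"`).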

Upvotes: 3

David Andres

Reputation: 31781

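Unfortunately, the (?!...) syntax is negative lookahead, which inspects the text after the current position rather than before it. To the best of my knowledge, JavaScript did not support negative lookbehind when this was written; it was added later, in ES2018, as (?<!...).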
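
To make the difference concrete, here is a small sketch comparing the two assertions. The function names are made up for illustration, and the lookbehind version assumes an engine that implements ES2018 regular expressions (current browsers and Node.js do):

```javascript
// The question's approach: (?!%C2) looks AHEAD from the spot where "%A3" is
// about to match. The next characters are "%A3", never "%C2", so the
// assertion always passes and every "%A3" is replaced.
function withLookahead(encoded) {
  return encoded.replace(/(?!%C2)%A3/g, "%C2%A3");
}

// Lookbehind variant: (?<!%C2) checks the text BEFORE the match position.
// Older engines reject this syntax when the script is parsed.
function withLookbehind(encoded) {
  return encoded.replace(/(?<!%C2)%A3/g, "%C2%A3");
}
```

With the input `"%C2%A3"`, the lookahead version produces `"%C2%C2%A3"`, while the lookbehind version leaves the string untouched.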

What you could do is go forward with the replacement anyway, and end up with %C2%C2%A3 strings, but these could easily be converted in a second pass to the desired %C2%A3.
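
A minimal sketch of that two-pass idea (the function name `fixPoundTwoPass` is hypothetical):

```javascript
// Hypothetical helper implementing the two-pass replacement described above.
function fixPoundTwoPass(encoded) {
  // Pass 1: prefix every "%A3", including the ones that were already correct.
  var doubled = encoded.replace(/%A3/g, "%C2%A3");
  // Pass 2: collapse the doubled prefix that pass 1 created.
  return doubled.replace(/%C2%C2%A3/g, "%C2%A3");
}
```

This works in engines without lookbehind and also copes with several `%A3` sequences in a row, at the cost of scanning the string twice.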

Upvotes: 4
