Richard

Reputation: 1166

JavaScript regular expression to match a string unless preceded by a backslash

How do I match U1234, but not \U1234, in JavaScript?

I can't figure out how to not match the single backslash. The closest I can get is:

\[\\]{0}U[0-9]{4}\b

But that doesn't work. Any suggestions?

Upvotes: 2

Views: 514

Answers (6)

Twisol

Reputation: 2772

I would suggest using lookbehind, but JavaScript doesn't seem to support it. Maybe you can match on U[0-9]{4}, find where the match is, and check the character to its left to see if it's a \ or not?
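A minimal sketch of that idea, in case it helps. The helper name findUnescaped is made up for illustration and is not from the question; it runs the plain pattern globally and keeps only the matches whose left-hand neighbour is not a backslash.

```javascript
// Hypothetical helper: collect every U + 4 digits token that is not
// immediately preceded by a backslash.
function findUnescaped(text) {
    var pattern = /U[0-9]{4}\b/g;
    var results = [];
    var m;
    while ((m = pattern.exec(text)) !== null) {
        // m.index is where the match starts; look one character to its left.
        var escaped = m.index > 0 && text.charAt(m.index - 1) === "\\";
        if (!escaped) {
            results.push(m[0]);
        }
    }
    return results;
}

console.log(findUnescaped("U1234 \\U5678 xU9999")); // [ 'U1234', 'U9999' ]
```

Because the check happens outside the regex, nothing before the U is consumed, so this also works for a match at the very start of the string.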

Upvotes: 0

NawaMan

Reputation: 932

Ummm ... Does \^U[0-9]{4}\b work for you?

Upvotes: 0

Kris Kowal

Reputation: 3846

JavaScript's RegExp does not support negative look-behind assertions. Patterns like /[^\\]U/ also consume the character before the U, so they match strings like "_U"; that's not the answer. Your best bet is to use two regular expressions: the first to find all occurrences, with or without a leading backslash, and the second to filter out the ones that start with a backslash.

"\\U0000 U0000".match(/\\?U[0-9]{4}/g)
.filter(function (match) {
    return !/^\\/.test(match)
})

Upvotes: 0

Michael Krelin - hacker

Reputation: 143081

[^\\]U[0-9]{4} or something along these lines. Note that it will not match the sequence at the very beginning of the subject string…

Upvotes: 1

Tim Pietzcker

Reputation: 336148

JavaScript definitely does not support lookbehind assertions. The next best way to get what you want, in my opinion, would be

(?:^|[^\\])(U[0-9]{4})

Explanation:

(?:          # non-capturing group - if it matches, we don't want to keep it
   ^         # either match the beginning of the string
   |         # or
   [^\\]     # match any character except for a backslash
)            # end of non-capturing group
(U[0-9]{4})  # capturing group number 1: match U followed by four digits
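For what it's worth, here is a rough sketch of how the pattern could be used. The wrapper name matchUnescaped is hypothetical. The point is that the full match may include the character in front of the U, so you read capturing group 1 instead.

```javascript
// Hypothetical wrapper around the pattern above.
function matchUnescaped(text) {
    var pattern = /(?:^|[^\\])(U[0-9]{4})/g;
    var results = [];
    var m;
    while ((m = pattern.exec(text)) !== null) {
        // m[0] may include the character before the U; m[1] never does.
        results.push(m[1]);
    }
    return results;
}

console.log(matchUnescaped("U1234 \\U5678 xU9999")); // [ 'U1234', 'U9999' ]
```

Since the non-capturing group consumes one character, two tokens that sit directly next to each other would not both be found by a global search.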

Upvotes: 9

Kornel

Reputation: 100110

Unfortunately JS doesn't seem to support the proper syntax for this, i.e. a negative lookbehind assertion: /(?<!\\)U[0-9]{4}/.

So you need to use:

/[^\\]U[0-9]{4}/

This is the syntax for a regexp literal. If you put the regexp in a string, you have to escape the backslashes again:

"[^\\\\]U[0-9]{4}"
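A quick way to check the escaping is to compare the two forms, as sketched below. The last part is an assumption about your runtime: engines that implement ES2018 do accept the lookbehind form, and building it with the RegExp constructor means an older engine throws at that line instead of failing to parse the whole script.

```javascript
// The literal and the string-built regexp describe the same pattern.
var fromLiteral = /[^\\]U[0-9]{4}/;
var fromString = new RegExp("[^\\\\]U[0-9]{4}");
console.log(fromLiteral.source === fromString.source); // true

// Assumption: an ES2018-capable engine, where negative lookbehind exists.
var withLookbehind = new RegExp("(?<!\\\\)U[0-9]{4}", "g");
console.log("U1234 \\U5678".match(withLookbehind)); // [ 'U1234' ]
```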

Upvotes: 0
