יוסי פיבקו

Reputation: 39

Convert PHP RegEx to Javascript

I have this PHP regex for removing special UTF-8 characters from strings:

[\x00-\x1F]|\xC2[\x80-\x9F]|\xE2[\x80-\x8F]{2}|\xE2\x80[\xA4-\xA8]|\xE2\x81[\x9F-\xAF]

I need to convert it to a JavaScript regex. I tried this code:

str = str.replace(/[\x00-\x1F]|\xC2[\x80-\x9F]|\xE2[\x80-\x8F]{2}|\xE2\x80[\xA4-\xA8]|\xE2\x81[\x9F-\xAF]/g, '');

But it does nothing.

I need your help. Thank you.

Upvotes: 0

Views: 606

Answers (2)

Bradley Weston

Reputation: 425

Simple mistake, big effect:

strTest = strTest.replace(/your regex here/g, "");
// ----------------------------------------^

Without the "global" flag (g), replace() only replaces the first match.
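
A minimal sketch of the difference, using a made-up sample string (the variable names are only for illustration and do not come from the question):

```javascript
// Hypothetical input: two control characters embedded in "abc".
const sample = "a\x01b\x02c";

// No "g" flag: replace() stops after the first match.
const firstOnly = sample.replace(/[\x00-\x1F]/, "");

// With "g": every match is replaced.
const allRemoved = sample.replace(/[\x00-\x1F]/g, "");
```

Here firstOnly still contains the second control character, while allRemoved is the plain string "abc".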

Side note: to remove every character that fails a more complex condition, such as not falling within a set of allowed Unicode character ranges, you can use a negative lookahead:

var regex = /(?![\x00-\x7F]|[\xC0-\xDF][\x80-\xBF]|[\xE0-\xEF][\x80-\xBF]{2}|[\xF0-\xF7][\x80-\xBF]{3})./g;
strTest = strTest.replace(regex, "");

where regex reads as

(?!      # negative look-ahead: a position *not followed by*:
  […]    #   any allowed character range from above
)        # end lookahead
.        # match this character (only if previous condition is met!)
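
One thing worth keeping in mind with byte-oriented patterns like these: a JavaScript regex matches UTF-16 code units, not UTF-8 bytes. A sequence such as \xC2\x80 is therefore read as the two characters U+00C2 and U+0080, not as the single character U+0080, which is probably why the pattern from the question appears to do nothing. Below is a rough sketch of one way to translate the question's byte ranges into code point ranges. The names decode3, esc, ranges, specialChars and stripSpecial are hypothetical, and the sketch assumes the \xE2[\x80-\x8F]{2} alternative should be reproduced literally (it decodes to sixteen separate 16-character blocks between U+2000 and U+23CF, which may be wider than intended).

```javascript
// Decode a three-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx) to a code point.
function decode3(b1, b2, b3) {
  return ((b1 & 0x0f) << 12) | ((b2 & 0x3f) << 6) | (b3 & 0x3f);
}

// Format a BMP code point as a \uXXXX escape for use in a regex source string.
const esc = (cp) => "\\u" + cp.toString(16).padStart(4, "0");

// Code point ranges equivalent to the byte ranges in the question's pattern.
const ranges = [
  [0x0000, 0x001f],                                          // [\x00-\x1F]
  [0x0080, 0x009f],                                          // \xC2[\x80-\x9F]
  [decode3(0xe2, 0x80, 0xa4), decode3(0xe2, 0x80, 0xa8)],    // U+2024..U+2028
  [decode3(0xe2, 0x81, 0x9f), decode3(0xe2, 0x81, 0xaf)],    // U+205F..U+206F
];

// \xE2[\x80-\x8F]{2}: one 16-character block per value of the second byte.
for (let b2 = 0x80; b2 <= 0x8f; b2++) {
  ranges.push([decode3(0xe2, b2, 0x80), decode3(0xe2, b2, 0x8f)]);
}

// Build a single character class from all ranges.
const specialChars = new RegExp(
  "[" + ranges.map(([lo, hi]) => esc(lo) + "-" + esc(hi)).join("") + "]",
  "g"
);

function stripSpecial(str) {
  return str.replace(specialChars, "");
}
```

With this, stripSpecial("a\u200Bb") returns "ab", whereas the byte-style pattern leaves the zero-width space in place. Note that [\x00-\x1F] also removes tabs and newlines, exactly as the original PHP pattern does.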

Upvotes: 3

Kazekage Gaara

Reputation: 15052

Try this:

str = str.replace(/[\x00-\x1F]|\xC2[\x80-\x9F]|\xE2[\x80-\x8F]{2}|\xE2\x80[\xA4-\xA8]|\xE2\x81[\x9F-\xAF]/gi, '');

Upvotes: 0
