Hanoi

Reputation: 101

Need to escape non-ASCII characters in JavaScript

Is there any function to do the following?

var specialStr = 'ipsum áá éé lore';
var encodedStr = someFunction(specialStr);
// then encodedStr should be like 'ipsum \u00E1\u00E1 \u00E9\u00E9 lore'

I need to encode the characters that are outside the ASCII range, and I need to do it with exactly that \uXXXX format. I don't know what it is called. Is it Unicode, maybe?

Upvotes: 10

Views: 10721

Answers (4)

Jens

Reputation: 1659

This works for me, specifically when using the Dropbox REST API:

    function encodeNonAsciiCharacters(value: string): string {
        let out = "";
        for (let i = 0; i < value.length; i++) {
            const code = value.charCodeAt(i);
            if (code <= 127) {
                // ASCII characters pass through unchanged
                out += value.charAt(i);
            } else {
                // Escape as \uXXXX, left-padding the hex value to four digits
                out += "\\u" + ("0000" + code.toString(16)).slice(-4);
            }
        }
        return out;
    }
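
To sanity-check output like this, it can help to decode it again. Below is a minimal sketch of the inverse, assuming the input contains only \uXXXX escapes of the kind produced above. The name unescapeUnicode is made up for illustration and is not part of any API.

```javascript
// Hypothetical helper: turn every \uXXXX escape back into its UTF-16 code unit.
function unescapeUnicode(escaped) {
    return escaped.replace(/\\u([0-9a-fA-F]{4})/g, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

console.log(unescapeUnicode("ipsum \\u00e1\\u00e1 \\u00e9\\u00e9 lore"));
```

Since the encoder does not escape the backslash itself, input that already contains a literal \u followed by four hex digits will not round-trip cleanly.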

Upvotes: 1

Max Murphy

Reputation: 1973

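If you need hex (\x) escapes rather than Unicode (\u) escapes, you can simplify @Domenic's answer to the following. Note that it is only correct for characters up to U+00FF, because a \x escape takes exactly two hex digits: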

"aäßåfu".replace(/./g, function(c){return c.charCodeAt(0)<128?c:"\\x"+c.charCodeAt(0).toString(16)})

returns: "a\xe4\xdf\xe5fu"
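
If the input may contain characters above U+00FF, one option is to fall back to \u escapes for those. This is a rough sketch under that assumption, not a drop-in replacement. The name hexEscape is invented here.

```javascript
// Hypothetical helper: \xNN for code units 0x80..0xFF, \uNNNN for anything higher.
function hexEscape(str) {
    // Without the "u" flag the regex matches one UTF-16 code unit at a time.
    return str.replace(/[^\x00-\x7F]/g, function (c) {
        var code = c.charCodeAt(0);
        if (code <= 0xFF) {
            // code is at least 0x80 here, so toString(16) always yields two digits
            return "\\x" + code.toString(16);
        }
        return "\\u" + ("0000" + code.toString(16)).slice(-4);
    });
}

console.log(hexEscape("aäßåfu"));
```

For input limited to Latin-1 this behaves like the one-liner above. Anything beyond that range gets a four-digit \u escape instead of a malformed \x one.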

Upvotes: 3

Domenic

Reputation: 112917

This should do the trick:

function padWithLeadingZeros(string) {
    return new Array(5 - string.length).join("0") + string;
}

function unicodeCharEscape(charCode) {
    return "\\u" + padWithLeadingZeros(charCode.toString(16));
}

function unicodeEscape(string) {
    return string.split("")
                 .map(function (char) {
                     var charCode = char.charCodeAt(0);
                     return charCode > 127 ? unicodeCharEscape(charCode) : char;
                 })
                 .join("");
}

For example:

var specialStr = 'ipsum áá éé lore';
var encodedStr = unicodeEscape(specialStr);

assert.equal("ipsum \\u00e1\\u00e1 \\u00e9\\u00e9 lore", encodedStr);
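
Note that the output above uses lowercase hex digits, while the question shows uppercase ones. If the exact casing matters, a more compact variant might look like the sketch below. It assumes an environment with String.prototype.padStart (ES2017), and the name unicodeEscapeUpper is made up for this example.

```javascript
// Hypothetical variant: escape each UTF-16 code unit above 0x7F as \uXXXX
// with uppercase hex digits.
function unicodeEscapeUpper(str) {
    // No "u" flag, so surrogate pairs are matched (and escaped) as two units.
    return str.replace(/[^\x00-\x7F]/g, function (c) {
        return "\\u" + c.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0");
    });
}

console.log(unicodeEscapeUpper("ipsum áá éé lore"));
```

Characters outside the Basic Multilingual Plane come out as two \uXXXX escapes (a surrogate pair), which is what JavaScript and JSON string literals expect.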

Upvotes: 20

fmsf

Reputation: 37177

Just for information: you can do as Domenic said, or use the built-in escape function, but that produces a different, more browser-friendly format (percent-encoding: %XX for characters up to U+00FF and %uXXXX above that). Note that escape is deprecated:

>>> escape("áéíóú");
"%E1%E9%ED%F3%FA"

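For comparison, here is a small sketch of what the non-deprecated encodeURIComponent produces for the same input. It percent-encodes the UTF-8 bytes of each character, so the result differs from both escape and the \uXXXX format asked about. It also encodes spaces and some ASCII punctuation, so it is not a substitute for the functions above.

```javascript
// encodeURIComponent percent-encodes the UTF-8 bytes of each non-ASCII character.
var encoded = encodeURIComponent("áéíóú");
console.log(encoded); // "%C3%A1%C3%A9%C3%AD%C3%B3%C3%BA"

// decodeURIComponent reverses the encoding.
var decoded = decodeURIComponent(encoded);
console.log(decoded === "áéíóú"); // true
```
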
Upvotes: 1
