Reputation: 17952
jQuery.ajax() is doing something weird when escaping my data.
For example, if I send the request:
$.ajax({
url: 'somethinguninteresting',
data: {
name: 'Ihave¬aweirdcharacter'
}
});
then investigate the XHR in Chrome devtools, it shows the "Request Payload" as name=Ihave%C2%ACaweirdcharacter
Now, I've figured out that:
'¬'.charCodeAt(0) === 172
and that 172 is AC in hexadecimal.
Working backwards, C2 (the "extra" character being prepended) in hexadecimal is 194 in decimal, and
String.fromCharCode(194) === 'Â'
Why does encodeURIComponent('¬') return '%C2%AC', which would appear to be the result of calling encodeURIComponent('Â¬') (which itself returns '%C3%82%C2%AC')?
Upvotes: 4
Views: 1425
Reputation: 173572
Although JavaScript uses UTF-16 (or UCS-2) internally, it performs URI encoding based on UTF-8.
The code point 172 is encoded in two bytes because it is above 127 and so can no longer be represented by ASCII (a single UTF-8 byte); two-byte encoding in UTF-8 follows this pattern:
110xxxxx 10xxxxxx
In place of the x bits we fill in the binary representation of 172, which is 10101100, padded on the left with zeros to fill all 11 positions:
11000010 10101100 = C2AC
   ^^^
   pad
This result is then percent-encoded to form %C2%AC, which is what you saw in the request payload.
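The bit-filling above can be sketched in a few lines of JavaScript. This is only an illustration, not how any engine implements it: percentEncodeTwoByte is a hypothetical helper name, and it deliberately handles only code points that need exactly two UTF-8 bytes (0x80 to 0x7FF).

```javascript
// Hypothetical helper (not a built-in): encodes a code point in the range
// 0x80..0x7FF as two UTF-8 bytes and percent-encodes each byte.
function percentEncodeTwoByte(codePoint) {
  if (codePoint < 0x80 || codePoint > 0x7ff) {
    throw new RangeError('this sketch only handles code points 0x80..0x7FF');
  }
  const first = 0xc0 | (codePoint >> 6);    // 110xxxxx: the top 5 bits
  const second = 0x80 | (codePoint & 0x3f); // 10xxxxxx: the low 6 bits
  return [first, second]
    .map((b) => '%' + b.toString(16).toUpperCase().padStart(2, '0'))
    .join('');
}

console.log(percentEncodeTwoByte(172)); // %C2%AC
console.log(percentEncodeTwoByte(172) === encodeURIComponent('\u00AC')); // true
```

For real use you would just call encodeURIComponent, which covers the one-, three- and four-byte cases as well.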
Upvotes: 2
Reputation: 5643
URL encoding (or percent-encoding) encodes Unicode characters using UTF-8. UTF-8 encodes characters with varying numbers of bytes. The ¬ character is encoded in UTF-8 as C2 AC.
The charCodeAt method returns a UTF-16 code unit, not UTF-8 bytes, so it does not handle multi-byte sequences. See this answer https://stackoverflow.com/a/18729931/4231110 for more details on how to use charCodeAt to encode a string with UTF-8.
In short, %C2%AC is the correct percent-encoding of ¬. This can be demonstrated by running
decodeURIComponent('%C2%AC') // '¬'
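To see the difference between what charCodeAt reports and the bytes that actually get percent-encoded, here is a small sketch. It assumes an environment where TextEncoder is available as a global (modern browsers and recent Node.js versions); the variable names are only for illustration.

```javascript
const ch = '\u00AC'; // the ¬ character

// charCodeAt gives the UTF-16 code unit of the character.
const utf16Unit = ch.charCodeAt(0);

// TextEncoder always produces UTF-8, the same bytes that percent-encoding uses.
const utf8Bytes = Array.from(new TextEncoder().encode(ch));

console.log(utf16Unit); // 172
console.log(utf8Bytes); // [ 194, 172 ], i.e. C2 AC in hexadecimal
console.log(decodeURIComponent('%C2%AC') === ch); // true
```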
Upvotes: 0