Reputation: 37436
I'm struggling to find any resources on this online, which is concerning. I've been reading about UCS-2 and UTF-16 woes, but I can't find a solution.
I need to get a value from an input:
var val = $('input').val()
and encode it to Base64, treating the text as UTF-16, so:
this is a test
becomes:
dABoAGkAcwAgAGkAcwAgAGEAIAB0AGUAcwB0AA==
and not the below, which you get treating it as UTF-8:
dGhpcyBpcyBhIHRlc3Q=
Upvotes: 3
Views: 4340
Reputation: 53735
Your data, once read into JavaScript, is no longer a byte sequence in any particular encoding. A string is held as a sequence of 16-bit code units, and Unicode itself is just a series of identifying numbers, one for each character in the Unicode repertoire. So: if you specifically need the data encoded as a UTF-16 byte sequence, build that byte sequence yourself, then Base64 encode that.
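A minimal sketch of that two-step approach, assuming little-endian UTF-16 without a BOM (which is what the expected output in the question corresponds to). The function name `utf16leToBase64` is made up for illustration; `btoa` is the standard browser function and is also a global in recent Node.js versions.

```javascript
// Hypothetical helper: encode a string as UTF-16LE bytes (no BOM),
// then Base64 encode those bytes.
function utf16leToBase64(text) {
  let binary = '';
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i); // one 16-bit code unit
    // Little endian: low byte first, then high byte.
    binary += String.fromCharCode(unit & 0xff, unit >> 8);
  }
  // btoa expects a "binary string" whose characters are all <= 0xFF.
  return btoa(binary);
}

console.log(utf16leToBase64('this is a test'));
// dABoAGkAcwAgAGkAcwAgAGEAIAB0AGUAcwB0AA==
```

Because the loop works on code units rather than code points, surrogate pairs are written out as two consecutive 16-bit values, which is how UTF-16 represents them anyway.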
But here's the fun part: which UTF-16 do you need? Little or Big Endian? With or without BOM? UTF-16 is a really inconvenient encoding format (we're not even going to touch UCS-2. It's obsolete. Has been for a long time).
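To make the difference concrete, here is a rough sketch that produces the raw bytes for each variant. `utf16Bytes` and its parameters are hypothetical names; `DataView.prototype.setUint16` is the standard call, and its third argument selects little-endian byte order.

```javascript
// Hypothetical helper: return the UTF-16 bytes of a string as a Uint8Array.
// littleEndian picks the byte order; withBom prepends U+FEFF.
function utf16Bytes(text, littleEndian, withBom) {
  const input = withBom ? '\uFEFF' + text : text;
  const view = new DataView(new ArrayBuffer(input.length * 2));
  for (let i = 0; i < input.length; i++) {
    // Write each 16-bit code unit in the requested byte order.
    view.setUint16(i * 2, input.charCodeAt(i), littleEndian);
  }
  return new Uint8Array(view.buffer);
}

console.log(utf16Bytes('t', true, false));  // bytes 0x74 0x00
console.log(utf16Bytes('t', false, false)); // bytes 0x00 0x74
console.log(utf16Bytes('t', true, true));   // bytes 0xFF 0xFE 0x74 0x00
console.log(utf16Bytes('t', false, true));  // bytes 0xFE 0xFF 0x00 0x74
```

The same one-character input gives four different byte sequences, and therefore four different Base64 strings, so the receiving side has to agree on which variant is in use.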
What you should really be doing is getting the text value from your HTML element, Base64 encoding it, and then having whatever receives that data unpack it as UTF-8; don't try to make JavaScript do more work than it has to. I presume you're sending this data to a server or something, in which case: your server language is way more elaborate than JavaScript, and can unpack text in about a million different encodings thanks to built-in functions. So just use that. Don't solve Y for X.
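For the UTF-8 route, something along these lines should do. It is only a sketch: `utf8ToBase64` is a made-up name, while `TextEncoder` and `btoa` are standard in browsers and available as globals in recent Node.js versions. Going through `TextEncoder` first matters because `btoa` on its own throws for characters above U+00FF.

```javascript
// Hypothetical helper: encode a string as UTF-8 bytes, then Base64 encode them.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  let binary = '';
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one character per byte, all <= 0xFF
  }
  return btoa(binary);
}

console.log(utf8ToBase64('this is a test'));
// dGhpcyBpcyBhIHRlc3Q=
```

The receiving side then Base64 decodes the value and interprets the resulting bytes as UTF-8, or converts them to whatever encoding it actually needs.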
Upvotes: 1