Reputation: 869
I am working on a chat application in WPF and want to support emoticons. I need to read emoticons sent from Android/iOS devices and show the corresponding images.
In WPF, each emoticon shows up as a plain black glyph. I have a library of emoji icons that are saved with their respective hex/escaped Unicode values.
So I want to convert these emoticon symbols into UTF-32/escaped Unicode, so that I can directly replace them with the matching emoji icons.
I tried to convert an emoticon to its Unicode value, but ended up with a different string made of a couple of symbols that have different Unicode values.
string unicodeString = "\u1F642"; // represents 🙂
Encoding unicode = Encoding.Unicode;
byte[] unicodeBytes = unicode.GetBytes(unicodeString);
char[] unicodeChars = new char[unicode.GetCharCount(unicodeBytes, 0, unicodeBytes.Length)];
unicode.GetChars(unicodeBytes, 0, unicodeBytes.Length, unicodeChars, 0);
string asciiString = new string(unicodeChars);
Any help is appreciated!!
Upvotes: 14
Views: 27258
Reputation: 11
You can simply use @using System.Web
for encoding:
var columndata = "CSR story with emoji 😀";
columndata = HttpUtility.UrlEncode(columndata);
It will encode the text and emoji.
Here my text contains HTML tags, so I call Trim() while decoding.
for decoding:
string titleRaw = HttpUtility.UrlDecode(@Model.columnNamne.ToString().Trim());
If not storing in HTML tags then:
string titleRaw = HttpUtility.UrlDecode(@Model.columnNamne.ToString());
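As a rough round-trip sketch of the approach above, assuming System.Web.HttpUtility is available in your project (on .NET Framework this needs a reference to System.Web); the variable names are only illustrative. Note that this produces percent-encoded UTF-8 bytes, not the code point hex value the question asks for.

```csharp
using System;
using System.Web;

// Text containing an emoji (U+1F600), written with the eight-digit \U escape.
string original = "CSR story with emoji \U0001F600";

// UrlEncode percent-encodes the UTF-8 bytes of the emoji.
string encoded = HttpUtility.UrlEncode(original);

// UrlDecode reverses the encoding and restores the original text.
string decoded = HttpUtility.UrlDecode(encoded);

Console.WriteLine(encoded);
Console.WriteLine(decoded == original); // True
```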
Upvotes: 1
Reputation: 18749
Since C# string literals accept the \UXXXXXXXX escape (eight hex digits, covering any Unicode code point), there is no need to use any encodings for this task.
Example 1.
var rgch = "\U0001F642".ToCharArray();
var str = $"\\u{(ushort)rgch[0]:X4}\\u{(ushort)rgch[1]:X4}";
Result: "\uD83D\uDE42"
Length of string str is 12 UTF-16 code units (24 bytes)
Example 2.
var rgch = "\U0001F642".ToCharArray();
var str = rgch[0] + "" + rgch[1];
Result: "🙂"
Length of string str is 2 UTF-16 code units (4 bytes)
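To get from the surrogate pair back to the single code point (the value an icon library would typically be keyed by), something like the following should work. It is a minimal sketch written as top-level statements (C# 9 or later) and assumes the string holds exactly one emoji encoded as a surrogate pair.

```csharp
using System;

// Assumption: the input is exactly one emoji stored as a surrogate pair.
string emoji = "\U0001F642";

// Each element of a .NET string is one UTF-16 code unit.
char high = emoji[0];
char low = emoji[1];

// Escaped UTF-16 form, as in Example 1.
string escaped = $"\\u{(int)high:X4}\\u{(int)low:X4}";

// Recombine the pair into the original code point.
int codePoint = char.ConvertToUtf32(high, low);

Console.WriteLine(escaped);           // \uD83D\uDE42
Console.WriteLine($"{codePoint:X}");  // 1F642
```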
Upvotes: 2
Reputation: 7440
Your escaped Unicode string does not mean what you expect in C#.
string unicodeString = "\u1F642"; // represents 🙂
This piece of code doesn't represent the "slightly smiling face", because the \u escape in C# consumes exactly four hex digits, i.e. a single UTF-16 code unit (2 bytes).
So what you actually get is the character U+1F64 followed by a plain 2.
http://www.fileformat.info/info/unicode/char/1f64/index.htm
So this: ὤ2
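You can check this yourself with a few lines. This is just a small sketch (top-level statements, C# 9 or later) that inspects the literal from the question:

```csharp
using System;

// The compiler reads this as '\u1F64' followed by the literal character '2'.
string s = "\u1F642";

Console.WriteLine(s.Length);           // 2
Console.WriteLine($"{(int)s[0]:X4}");  // 1F64
Console.WriteLine(s[1]);               // 2
```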
If you want to specify the full code point (which does not fit into 2 bytes) and get the corresponding string, you have to use:
var unicodeString = char.ConvertFromUtf32(0x1F642);
https://msdn.microsoft.com/en-us/library/system.char.convertfromutf32(v=vs.110).aspx
or you could write it like this:
\uD83D\uDE42
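As a quick sanity check, the different spellings should all produce the same string. A minimal sketch (the variable names are only illustrative; the eight-digit \U escape is a third way to write the same character):

```csharp
using System;

string fromCodePoint = char.ConvertFromUtf32(0x1F642);
string fromSurrogates = "\uD83D\uDE42";
string fromLongEscape = "\U0001F642";

Console.WriteLine(fromCodePoint == fromSurrogates); // True
Console.WriteLine(fromCodePoint == fromLongEscape); // True
Console.WriteLine(fromCodePoint.Length);            // 2
```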
This string can then be converted back like this, to get the desired result, which is again the hex value that we started with:
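using System.Text; // UTF32Encoding and StringBuilder

var x = char.ConvertFromUtf32(0x1F642);
// Big-endian UTF-32 without a byte order mark
var enc = new UTF32Encoding(true, false);
var bytes = enc.GetBytes(x);
var hex = new StringBuilder();
for (int i = 0; i < bytes.Length; i++)
{
    // X2 = two upper-case hex digits per byte
    hex.AppendFormat("{0:X2}", bytes[i]);
}
var o = hex.ToString();
//result is 0001F642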
(The result has leading zeros, since a UTF-32 code unit is always 4 bytes.)
Instead of the for loop you can also use BitConverter.ToString(byte[])
(https://msdn.microsoft.com/en-us/library/3a733s97(v=vs.110).aspx); the result will then look like this:
var x = char.ConvertFromUtf32(0x1F642);
var enc = new UTF32Encoding(true, false);
var bytes = enc.GetBytes(x);
var o = BitConverter.ToString(bytes);
//result is 00-01-F6-42
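For a whole chat message rather than a single emoji, you could walk the string and collect the code point of every character. The following is only a sketch: ToCodePointHex is a hypothetical helper name, and it assumes the input is well-formed UTF-16 (char.ConvertToUtf32 throws on an unpaired surrogate).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: returns every code point in the text as upper-case hex.
List<string> ToCodePointHex(string text)
{
    var result = new List<string>();
    for (int i = 0; i < text.Length; i++)
    {
        // Reads one code point, combining a surrogate pair if one starts at i.
        int codePoint = char.ConvertToUtf32(text, i);
        result.Add(codePoint.ToString("X"));

        // A surrogate pair occupies two chars, so skip the low surrogate.
        if (char.IsSurrogatePair(text, i)) i++;
    }
    return result;
}

var parts = ToCodePointHex("Hi \U0001F642");
Console.WriteLine(string.Join(" ", parts)); // 48 69 20 1F642
```

Keep in mind that some emoji consist of several code points (for example skin tone modifiers), so a lookup against an icon library may need to consider more than one code point at a time.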
Upvotes: 22
Reputation: 716
Please be aware that Encoding.Unicode is UTF-16 in C#. To read 32-bit Unicode, there is Encoding.UTF32. See the MSDN documentation for Encoding.UTF32.
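To illustrate the difference, here is a small sketch comparing the bytes the two encodings produce for the same emoji. Both Encoding.Unicode and Encoding.UTF32 are little-endian, so the bytes come out in reversed order compared to the code point.

```csharp
using System;
using System.Text;

string emoji = "\U0001F642";

// UTF-16 (little-endian): the surrogate pair D83D DE42, two bytes each.
string utf16 = BitConverter.ToString(Encoding.Unicode.GetBytes(emoji));

// UTF-32 (little-endian): the code point 0001F642 as four bytes.
string utf32 = BitConverter.ToString(Encoding.UTF32.GetBytes(emoji));

Console.WriteLine(utf16); // 3D-D8-42-DE
Console.WriteLine(utf32); // 42-F6-01-00
```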
Upvotes: 1