Reputation: 787
I want to decode UTF-8 (or Unicode) escaped text into a normal string.
For example, I want to convert a string such as "\uc778\uc0b0\uc544\uc5f0\uc2dc\uba58\ud2b8, \uce58\uba74\uc5f4\uad6c\uc804\uc0c9\uc81c" into readable text.
I tried System.Text.UTF8Encoding and Text.Encoding.UTF8.GetString(), but neither works.
How can I solve this? It seems the solution should be simple. If possible, please write the code in VB.NET.
Thank you for your advice!
Thanks for replying.
I don't think I made my point clearly.
What I want is to convert "\uc885\ud569\uc9c4\ub8cc\uc2e4 \uacac\ud559 / \uce58\uacfc\uc758\uc0ac\uc724\ub9ac \ud1a0\ub860" (Unicode escape codes, not characters) into a readable string, for example "가나다라", whether the text is Korean, Chinese, or anything else.
And I need .NET code to do that.
I tried:
theString = Convert.toString("\uc885\ud569");
I also tried:
Dim utf8Encoding As New System.Text.UTF8Encoding
Dim encodedString() As Byte
encodedString = utf8Encoding.GetBytes(encodedString) .....
and a few more, but nothing converts "\uc885\ud569" to "가나" (that's just an example; I understand that each '\u????' code corresponds to a single character, e.g. '가').
Thank you!
Upvotes: 0
Views: 1570
Reputation: 245066
I think I finally understand what the problem is. In C#, a string literal like "\uc778\uc0b0" is exactly the same as "인산" (and .NET strings are UTF-16, not UTF-8). But VB.NET doesn't understand such escape sequences.
I think the best option here would be to write the Korean characters directly; something like "인산" is valid VB.NET code.
If you really need to use C#-like escape sequences, you can use Regex.Unescape():
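' VB.NET keeps the backslashes literally, so the \uXXXX sequences are decoded at run time.
Dim escaped = "\uc778\uc0b0\uc544\uc5f0\uc2dc\uba58\ud2b8, \uce58\uba74\uc5f4\uad6c\uc804\uc0c9\uc81c"
Dim unescaped = System.Text.RegularExpressions.Regex.Unescape(escaped)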
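For context, here is a minimal console sketch of the same idea. Regex.Unescape() also interprets other escapes (such as \n or \t) and can throw on malformed sequences, so the sketch includes a hypothetical helper, DecodeUnicodeEscapes, that replaces only \uXXXX sequences and leaves every other backslash untouched. The module and helper names are my own, not part of any library.

```vb
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module UnescapeDemo
    ' Hypothetical helper: decodes only \uXXXX sequences (four hex digits)
    ' and leaves all other text, including other backslashes, as-is.
    Function DecodeUnicodeEscapes(input As String) As String
        Return Regex.Replace(input, "\\u([0-9A-Fa-f]{4})",
            Function(m As Match) ChrW(Integer.Parse(m.Groups(1).Value, NumberStyles.HexNumber)).ToString())
    End Function

    Sub Main()
        ' Assumption: the console font can display Hangul.
        Console.OutputEncoding = System.Text.Encoding.UTF8

        Dim escaped = "\uc885\ud569"

        ' VB.NET stores the backslashes literally: 2 sequences x 6 characters.
        Console.WriteLine(escaped.Length)                ' 12

        ' Both approaches turn the two escape codes into two characters.
        Console.WriteLine(Regex.Unescape(escaped))       ' 종합
        Console.WriteLine(DecodeUnicodeEscapes(escaped)) ' 종합
    End Sub
End Module
```

If the input can contain backslashes that are not escape codes (Windows paths, for instance), the helper is probably the safer choice; for input that is known to be C#-style escaped text, Regex.Unescape() is shorter.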
Upvotes: 1
Reputation: 17680
You don't have to do anything to convert it.
The text is Korean (Hangul) characters.
Simply output it; that worked for me. I just called Console.WriteLine() from LINQPad.
Each \uXXXX is the Unicode value of a specific character.
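To illustrate the relationship between a \uXXXX code and its character, here is a small sketch. Note that printing the literal directly only works where the compiler interprets the escape (as C# does); in VB.NET you would build the character from its code yourself, for example with ChrW. The module name is hypothetical.

```vb
Imports System

Module CodePointDemo
    Sub Main()
        ' Assumption: the console font can display Hangul.
        Console.OutputEncoding = System.Text.Encoding.UTF8

        ' \uc885 denotes the UTF-16 code unit &HC885, which is one character.
        Dim ch As Char = ChrW(&HC885)
        Console.WriteLine(ch)                       ' 종

        ' AscW goes the other way: character back to its numeric code.
        Console.WriteLine(AscW(ch).ToString("X4"))  ' C885

        ' Two codes give a two-character string.
        Dim s As String = ChrW(&HC885) & ChrW(&HD569)
        Console.WriteLine(s.Length)                 ' 2
    End Sub
End Module
```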
Upvotes: 1