Reputation: 181
I have a Unicode string from a text file such that. And I want to display the real character.
For example:
\u8ba1\u7b97\u673a\u2022\u7f51\u7edc\u2022\u6280\u672f\u7c7b
When read this string from text file, using StreamReader.ReadToLine()
, it escape the \
to '\\'
such as "\\u8ba1"
, which is not wanted.
It will display the Unicode string same as from text. Which I want is to display the real character.
"\\u8ba1"
to "\u8ba1"
in the result string. Upvotes: 18
Views: 11088
Reputation: 191
If you are utilizing Newtosoft.JSON the solution is quite simple:
var s = @"\u8ba1\u7b97\u673a\u2022\u7f51\u7edc\u2022\u6280\u672f\u7c7b";
s = Newtonsoft.Json.JsonConvert.DeserializeObject<string>("\"" + s + "\"");
Upvotes: 0
Reputation: 2933
This question came out in the first result when googling, but I thought there should be a simpler way... this is what I ended up using:
using System.Text.RegularExpressions;
//...
var str = "Ingl\\u00e9s";
var converted = Regex.Unescape(str);
Console.WriteLine($"{converted} {str != converted}"); // Inglés True
Upvotes: 5
Reputation: 217351
If you have a string like
var input1 = "\u8ba1\u7b97\u673a\u2022\u7f51\u7edc\u2022\u6280\u672f\u7c7b";
// input1 == "计算机•网络•技术类"
you don't need to unescape anything. It's just the string literal that contains the escape sequences, not the string itself.
If you have a string like
var input2 = @"\u8ba1\u7b97\u673a\u2022\u7f51\u7edc\u2022\u6280\u672f\u7c7b";
you can unescape it using the following regex:
var result = Regex.Replace(
input2,
@"\\[Uu]([0-9A-Fa-f]{4})",
m => char.ToString(
(char)ushort.Parse(m.Groups[1].Value, NumberStyles.AllowHexSpecifier)));
// result == "计算机•网络•技术类"
Upvotes: 28