Marc Andreson

Reputation: 3495

Unicode characters string

I have the following String of characters.

string s = "\\u0625\\u0647\\u0644";

When I print the above sequence, I get:

\u0625\u0647\u0644

How can I get the real printable Unicode characters instead of this \uxxxx representation?

Upvotes: 22

Views: 25646

Answers (5)

Ruzihm

Reputation: 20249

Asker posted this as an answer to their question:

I have found the answer:

s = System.Text.RegularExpressions.Regex.Unescape(s);
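As a minimal sketch of how that call behaves on the string from the question (the class name `UnescapeDemo` and the helper `Decode` are made up for illustration): `Regex.Unescape` turns each literal `\uXXXX` sequence into the character it names. Keep in mind that it also interprets other regex-style escapes such as `\n` and `\t`, so it is best suited to input you know contains only escapes you want decoded.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnescapeDemo
{
    // Hypothetical helper: a thin wrapper around Regex.Unescape.
    public static string Decode(string escaped) => Regex.Unescape(escaped);

    public static void Main()
    {
        // The source literal doubles each backslash, so at runtime the string
        // holds the 18 characters \u0625\u0647\u0644, not three Arabic letters.
        string s = "\\u0625\\u0647\\u0644";
        string decoded = Decode(s);

        Console.WriteLine(s.Length);                        // 18
        Console.WriteLine(decoded.Length);                  // 3
        Console.WriteLine(decoded == "\u0625\u0647\u0644"); // True
    }
}
```

Whether the decoded letters display correctly in a console window depends on the console's encoding and font, which is why the sketch compares strings instead of printing them.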

Upvotes: 3

Hakan Fıstık

Reputation: 19401

I had the string "\u0001" and wanted to get its numeric value.
After trying several approaches, this is what worked for me:

int val = Convert.ToInt32(Convert.ToChar("\u0001")); // val = 1;

If you have multiple characters, you can use the following technique:

var original ="\u0001\u0002";
var s = "";
for (int i = 0; i < original.Length; i++)
{
    s += Convert.ToInt32(Convert.ToChar(original[i]));
}

// s will be "12"

Upvotes: -1

Joey

Reputation: 354356

If you really don't control the string, then you need to replace those escape sequences with their values:

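// The backslash must itself be escaped in the pattern; a bare \u( is rejected by the .NET regex parser.
s = Regex.Replace(s, @"\\u([0-9A-Fa-f]{4})", m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());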

and hope that you don't have \\ escapes in there too.
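If the input may contain escaped backslashes as well, one possible variation (a sketch, not part of the original answer; the names `EscapeDecoder` and `Decode` are hypothetical) is to match `\\` in the same pattern, ahead of the `\uXXXX` alternative. Because the regex scans left to right, a doubled backslash is consumed first and collapsed to a single backslash, so the `u` that follows it is left as plain text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDecoder
{
    // Alternative 1: an escaped backslash (two backslashes in the input).
    // Alternative 2: a backslash, 'u', and exactly four hex digits (captured).
    private static readonly Regex Pattern =
        new Regex(@"\\\\|\\u([0-9A-Fa-f]{4})");

    public static string Decode(string input) =>
        Pattern.Replace(input, m => m.Groups[1].Success
            ? ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString()
            : "\\"); // collapse the escaped backslash to a single one

    public static void Main()
    {
        Console.WriteLine(Decode(@"\u0041\u0042")); // AB
        Console.WriteLine(Decode(@"\\u0041"));      // \u0041
    }
}
```

This assumes `\\` and `\uXXXX` are the only escapes in the input; any other sequence, such as `\n`, passes through unchanged.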

Upvotes: 6

dierre

Reputation: 7210

I would suggest the use of String.Normalize. You can find everything here:

http://msdn.microsoft.com/it-it/library/8eaxk1x2.aspx

Upvotes: -2

Ria

Reputation: 10367

Try Regex:

String inputString = "\\u0625\\u0647\\u0644";

var stringBuilder = new StringBuilder();
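// Match a literal backslash, 'u', then exactly four hex digits.
foreach (Match match in Regex.Matches(inputString, @"\\u([0-9A-Fa-f]{4})"))
{
    // Parse the captured digits as base 16 and append the resulting character.
    stringBuilder.Append((char)Convert.ToInt32(match.Groups[1].Value, 16));
}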

var result = stringBuilder.ToString();

Upvotes: 1
