think

Reputation: 312

C#: Correct decoding of escaped Unicode characters in a local string (i.e. not \\u20ac)

I have a string that comes from a web page's source code:

<script type="text/javascript">
var itemsLocalDeals = [{"category":"HEALTHCARE SERVICES",
"dealPermaLink":"/deals/aachen/NLP-Deutschlandde
5510969","dealPrice":"399,00 \u20ac",..........

I do some processing on that string, such as extracting each dealPrice and adding it to a List<> (the full string contains more than one dealPrice).

Is there a method that decodes every "\u20ac" to its real character ("€")? Other characters are escaped too, so it is not only the € character that has to be decoded.

When I debug my code and inspect the local fields/variables, the string contains not the "€" character but the escaped sequence "\\u20ac".

I am looking for something like myString.DecodeUnicodeToRealCharacters().

I'm writing the result to a UTF-8 encoded file, result.txt.

Thanks a lot!

P.S.: Unfortunately, I am limited to .NET 2.0.

Upvotes: 0

Views: 4298

Answers (3)

Rob

Reputation: 1081

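// Needs: using System.Globalization; using System.Text.RegularExpressions;
public string DecodeUnicodeToRealCharacters(string s)
{
    // An Encoding.Unicode GetBytes/GetString round trip returns s unchanged,
    // so replace each literal \uXXXX escape with the character it encodes.
    return Regex.Replace(s, @"\\u([0-9A-Fa-f]{4})", delegate(Match m) {
        return ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString(); });
}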

Upvotes: 1

Mikhail

Reputation: 898

Can you please show the code you're using to write the text? This works just fine:

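// Needs: using System.IO; using System.Text;
string str = "\u20ac"; // the compiler turns this escape into the actual "€" character
using (StreamWriter sw = new StreamWriter(@"C:\trythis.txt", false, Encoding.UTF8))
{
    sw.Write(str);
}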

Upvotes: 0

L.B

Reputation: 116098

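You can use Regex.Unescape(@"\u20ac"); (note the verbatim string: the backslash has to be a literal one, as it is in your downloaded text).

But it would be better to use a JSON parser, since your string appears to be JSON (it starts with [{"category":"HEALTHCARE SERVICES",.....).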
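
As a rough illustration of the Regex.Unescape route for the dealPrice values from the question, here is a minimal sketch that should compile against .NET 2.0. The class name DealPriceDemo, the helper ExtractDealPrices and the sample input are hypothetical and only stand in for the real page text. Regex.Unescape follows regular-expression escaping rules rather than JSON's, so treat this as an approximation; a real JSON parser remains the safer choice.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class DealPriceDemo
{
    // Hypothetical helper: collects every "dealPrice" value from the raw page text
    // and decodes escapes such as \u20ac via Regex.Unescape.
    public static List<string> ExtractDealPrices(string source)
    {
        List<string> prices = new List<string>();
        // Capture the quoted value after "dealPrice":, allowing backslash escapes inside it.
        Regex pattern = new Regex("\"dealPrice\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");
        foreach (Match m in pattern.Matches(source))
        {
            prices.Add(Regex.Unescape(m.Groups[1].Value));
        }
        return prices;
    }

    static void Main()
    {
        // Verbatim string: the backslash stays literal, as in the downloaded page.
        string source = @"[{""dealPrice"":""399,00 \u20ac""},{""dealPrice"":""12,50 \u20ac""}]";
        foreach (string price in ExtractDealPrices(source))
        {
            Console.WriteLine(price);
        }
    }
}
```

The decoded strings can then be written with a StreamWriter that uses Encoding.UTF8. Whether the console itself displays "€" correctly depends on its output encoding, so checking the written file is the more reliable test.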

Upvotes: 3
