Reputation: 312
C#: I have a string that comes from a web page's source code:
<script type="text/javascript">
var itemsLocalDeals = [{"category":"HEALTHCARE SERVICES",
"dealPermaLink":"/deals/aachen/NLP-Deutschlandde
5510969","dealPrice":"399,00 \u20ac",..........
I process that string, for example by extracting each dealPrice and adding it to a List<> (the full string contains more than one dealPrice).
Is there a method that decodes every "\u20ac" into its real character ("€")? Other characters are escaped as well, so it is not only the € character that has to be decoded.
When I debug my code and inspect the local variables, the string does not contain the "€" character but the escaped sequence "\\u20ac".
I'm looking for something like myString.DecodeUnicodeToRealCharacters.
I'm writing the result to a UTF-8 file, result.txt.
Thanks a lot!
P.S.: Unfortunately, I can only use .NET 2.0.
Upvotes: 0
Views: 4298
Reputation: 1081
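public string DecodeUnicodeToRealCharacters(string s)
{
    // Needs System.Text.RegularExpressions. A GetBytes/GetString round trip returns s
    // unchanged, so replace each literal \uXXXX escape with the character it encodes.
    return Regex.Replace(s, @"\\u([0-9a-fA-F]{4})", delegate(Match m) {
        return ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString();
    });
}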
Upvotes: 1
Reputation: 898
Can you please show the code you're using to write the text? This works just fine:
string str = "\u20ac";
using (StreamWriter sw = new StreamWriter(@"C:\trythis.txt", false, Encoding.UTF8))
{
    sw.Write(str);
}
Upvotes: 0
Reputation: 116098
You can use Regex.Unescape(@"\u20ac");
But it would be better to use a JSON parser, since your string appears to be JSON (it starts with [{"category":"HEALTHCARE SERVICES",...).
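As a minimal sketch of the Regex.Unescape suggestion: it assumes the scraped text holds the literal six characters \u20ac, which is why the sample uses a verbatim string. The names UnescapeDemo and Decode are made up for illustration, and the code sticks to C# 2.0 syntax.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnescapeDemo
{
    // Hypothetical helper: converts escapes such as \u20ac into real characters.
    // Regex.Unescape also converts other backslash escapes (\n, \t, \", ...)
    // and can throw ArgumentException on a malformed escape sequence.
    public static string Decode(string raw)
    {
        return Regex.Unescape(raw);
    }

    public static void Main()
    {
        // Verbatim string: contains a backslash followed by u20ac, as in the page source.
        string raw = @"399,00 \u20ac";
        string decoded = Decode(raw);

        Console.WriteLine(decoded.Length);          // 8
        Console.WriteLine(decoded[7] == '\u20ac');  // True
    }
}
```

Because Regex.Unescape follows regex escaping rules rather than JSON rules, this is probably only safe on small extracted values such as a single dealPrice. For the whole array, a JSON parser remains the more robust choice.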
Upvotes: 3