Reputation: 2543
I'm using HtmlAgilityPack to fetch some metadata from a number of websites. However, a lot of websites have their metadata content saved with ISO-8859-1 encoding, so I get strings like:
Alt sammen under ét tag. Kontakt os i dag på
Being somewhat of an encoding beginner, I'm at a complete loss as to how to get the regular UTF-8 encoded string. I've tried a procedure like this:
Encoding.GetEncoding("iso-8859-1").GetString(Encoding.UTF8.GetBytes(input));
which just gives me an even more obscure string. Can someone point me in the right direction? Even Stack Overflow converts the ISO-8859-1 characters to the correct ones when I write them inside the quote blocks.
Upvotes: 0
Views: 533
Reputation: 3853
Are you looking for
"Alt sammen under ét tag. Kontakt os i dag på"
as output?
In that case you might be confusing character encoding with HTML encoding, which is yet another layer of encoding on top of the page's character encoding.
If this is the case, use System.Web.HttpUtility.HtmlDecode to get the string in "human-readable" form.
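As a rough sketch of what that looks like in practice: the helper name `MetaTextDecoder.DecodeEntities` is made up for illustration, and the sample input in the usage note assumes the scraped text contained named entities such as `&eacute;` and `&aring;` (the question doesn't show the raw string, so that part is a guess).

```csharp
using System;
using System.Web; // HttpUtility lives here; on .NET Framework this needs a reference to System.Web

public static class MetaTextDecoder
{
    // Hypothetical helper, not part of HtmlAgilityPack.
    // Replaces HTML entities (e.g. "&aring;") with the characters they stand for.
    // No byte-level re-encoding is involved: the input is already a .NET string.
    public static string DecodeEntities(string raw)
    {
        if (raw == null)
        {
            return null;
        }
        return HttpUtility.HtmlDecode(raw);
    }
}
```

Calling `MetaTextDecoder.DecodeEntities("Kontakt os i dag p&aring;")` should give back `Kontakt os i dag på`. If you can't reference System.Web, `System.Net.WebUtility.HtmlDecode` should be usable the same way.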
Upvotes: 3