thethanghn

Reputation: 329

HttpUtility.HtmlDecode cannot decode ASCII greater than 127

I have a list of characters that display fine in a WebBrowser control when written as encoded characters such as &#128; ... But when posting these characters to the server, I realized that HttpUtility.HtmlDecode cannot convert them to characters the way the browser does; they all become spaces.

text = System.Web.HttpUtility.HtmlDecode("&#128;");

I expect it to return €, but it returns a space instead. The same thing happens for some other characters as well.

Does anyone know how to fix this or any workaround?

Upvotes: 2

Views: 1934

Answers (3)

Slion

Reputation: 3048

You typically want to do something like:

// Requires: using System.Net; using System.Text;
string html = "€";
string trash = WebUtility.HtmlDecode(html);
// Re-read the decoded text's default-encoding bytes as UTF-8
byte[] bytes = Encoding.Default.GetBytes(trash);
string proper = Encoding.UTF8.GetString(bytes);

Upvotes: 0

Sören Kuklau

Reputation: 19930

ASCII is 7-bit; there are no characters 128 through 255. The MSDN article you linked follows the long tradition of pretending ASCII is 8-bit; the article actually shows code page 437.

I'm not sure why you're not simply writing € (compatibility?), but &euro; or &#8364; should do, too.
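If the markup cannot be changed, one possible workaround is to remap the references before decoding. The sketch below assumes the input uses decimal references in the 128 to 159 range (for example &#128;) and that they were meant as Windows-1252 characters, which is how browsers interpret them. The class name BrowserLikeDecoder and its Decode method are made up for illustration, and hexadecimal references are not handled.

```csharp
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the framework.
static class BrowserLikeDecoder
{
    // Windows-1252 characters for bytes 0x80..0x9F, in order.
    // Bytes that Windows-1252 leaves undefined map to themselves.
    private const string Cp1252 =
        "\u20AC\u0081\u201A\u0192\u201E\u2026\u2020\u2021\u02C6\u2030\u0160\u2039\u0152\u008D\u017D\u008F" +
        "\u0090\u2018\u2019\u201C\u201D\u2022\u2013\u2014\u02DC\u2122\u0161\u203A\u0153\u009D\u017E\u0178";

    public static string Decode(string html)
    {
        // Replace decimal references 128..159 with the Windows-1252 character;
        // every other reference is left for HtmlDecode to handle.
        string remapped = Regex.Replace(html, "&#([0-9]+);", m =>
        {
            int code;
            if (int.TryParse(m.Groups[1].Value, out code) && code >= 128 && code <= 159)
                return Cp1252[code - 128].ToString();
            return m.Value;
        });
        return WebUtility.HtmlDecode(remapped);
    }
}
```

With this helper, BrowserLikeDecoder.Decode("&#128;") returns "€", while references outside that range are decoded by WebUtility.HtmlDecode as usual.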

Upvotes: 0

Aliostad

Reputation: 81660

This is commonly the result of using literal values and mixing UTF-8 and ASCII. In UTF-8 the euro sign is encoded as 3 bytes, so there is no ASCII counterpart for it.

Update

Your code is invalid if you are using UTF-8, since only the first 128 characters are encoded as a single byte and the rest are encoded in multiple bytes. You need to use the Unicode syntax:

  // !!! NOT HtmlDecode!!!
  text = System.Web.HttpUtility.UrlDecode("%E2%82%AC");

UPDATE

OK, I have left the code as it was but added a comment that HtmlDecode does not work here. It does not work because this is not an HTML encoding; it is a URL encoding, so you need to use UrlDecode instead.

Upvotes: 1
