emekslondon

Reputation: 11

How can I convert HTML text to UTF-8 in C#?

How can I convert this input value

It’s time for Events This Weekend. Browse through and see which events are happening around you. Have fun Ciao! LoudNProudLive Series: ‘Hit Makers’ Special Edition LoudNProudLive Series presents a ‘Hit Makers’ Special Edition with headliners Tolu (of Project Fame), Simi and Oyinkanade. Date: Thursday, April 30, 2015 Time: 8 PM Venue: ELIAS (Ocean Bay Mall), […]

to a human-readable sentence in UTF-8? I tried the code below, and this is what I got:

It�s time for Events This Weekend. Browse through and see which events are happening around you. Have fun Ciao! LoudNProudLive Series: �Hit Makers� Special Edition LoudNProudLive Series presents a �Hit Makers� Special Edition with headliners Tolu (of Project Fame), Simi and Oyinkanade. Date: Thursday, April 30, 2015 Time: 8 PM Venue: ELIAS (Ocean Bay Mall), […]

    //convert html to utf-8
    private static string cleanUpCodes(string value)
    {
        //convert from iso to utf-8
        Encoding iso = Encoding.GetEncoding("windows-1252");
        Encoding utf8 = Encoding.UTF8;
        byte[] isoBytes = iso.GetBytes(value);
        byte[] utf8Bytes = Encoding.Convert(utf8, iso, isoBytes);
        string msg = utf8.GetString(utf8Bytes);

        //convert to real html
        msg = HttpUtility.HtmlDecode(msg);

        return msg;
    }

Upvotes: 1

Views: 5573

Answers (1)

Charles Mager

Reputation: 26213

Presumably the text was decoded using the wrong encoding, hence the garbled characters. In that case you don't want to convert between encodings; you just want to get the original bytes back and decode them again with the right encoding.

For example:

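// Encoding.Default is the system ANSI code page on .NET Framework (often Windows-1252);
// on .NET Core and later it is UTF-8, so name the encoding explicitly there.
var bytes = Encoding.Default.GetBytes(value);
var result = Encoding.UTF8.GetString(bytes);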

That gives this, which is pretty close:

It’s time for Events This Weekend. Browse through and see which events are happening around you. Have fun Ciao! LoudNProudLive Series: ‘Hit Makers’ Special Edition LoudNProudLive Series presents a ‘Hit Makers’ Special Edition with headliners Tolu (of Project Fame), Simi and Oyinkanade. Date: Thursday, April 30, 2015 Time: 8 PM Venue: ELIAS (Ocean Bay Mall), [�]
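
The same repair can be sketched with the wrongly used encoding named explicitly, so the result does not depend on the machine's default code page. This is only a sketch under assumptions: it assumes the text was originally UTF-8 and was mistakenly decoded as Windows-1252, and that it runs on .NET Core 3.0 or later, where Windows-1252 has to be registered first (on .NET Framework the `RegisterProvider` line can be dropped). The names `MojibakeRepair` and `Repair` are hypothetical.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Hypothetical helper: undoes a UTF-8 -> Windows-1252 mis-decode.
    public static string Repair(string garbled)
    {
        // Assumption: modern .NET, where code page 1252 is not available
        // until the code pages provider has been registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding windows1252 = Encoding.GetEncoding(1252);

        // Re-encoding with the wrong encoding recovers the original bytes...
        byte[] originalBytes = windows1252.GetBytes(garbled);

        // ...which can then be decoded with the encoding they were written in.
        return Encoding.UTF8.GetString(originalBytes);
    }
}
```

For instance, `MojibakeRepair.Repair("Itâ€™s")` returns `It’s`. Characters that were already correct in the input, such as the ellipsis at the end, do not form valid UTF-8 once re-encoded, which is why they come out as `�`.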

I'd be inclined to get to the source of the problem though - how did you get this string?
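
If the string is built from raw bytes you control (a file or an HTTP response body, say), one way to fix this at the source is to decode those bytes as UTF-8 yourself and fail loudly on invalid input instead of silently producing `�`. A minimal sketch, assuming the source data really is UTF-8; `StrictUtf8` and `Decode` are hypothetical names.

```csharp
using System;
using System.Text;

public static class StrictUtf8
{
    // throwOnInvalidBytes: true makes decoding throw a DecoderFallbackException
    // on malformed input instead of substituting U+FFFD.
    private static readonly Encoding Strict =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Hypothetical helper: decode raw bytes that are expected to be UTF-8.
    public static string Decode(byte[] rawBytes)
    {
        return Strict.GetString(rawBytes);
    }
}
```

An exception here is a hint that the data is in some other encoding and should be decoded with that one instead.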

Upvotes: 4
