Aminion

Reputation: 465

HttpWebRequest wrong encoding determination

I'm trying to read the HTML text of a page from the site http://konungstvo.ru/, which is UTF-8 encoded.

var request = _requestCreator.Create(uri);
try
{
    using (var response = request.GetResponse())
    {
        if (response.ContentType.Contains("text/html"))
        {
            using (var reader = new System.IO.StreamReader(response.GetResponseStream()))
            {
                string responseText = reader.ReadToEnd();
            }
        }
    }
}
catch (System.Net.WebException)
{
    // Error handling was not shown in the original post.
}

But I'm getting \u001f�\b\01V\u0002X\u0002��X�n\u001b� and so on, although the same code works with other sites.

Upvotes: 1

Views: 53

Answers (1)

Jan Köhler

Reputation: 6032

I think you need a character encoding for the Latin/Cyrillic alphabet, which could be ISO/IEC 8859-5 or, e.g., Windows-1251:

var encoding = Encoding.GetEncoding("iso-8859-5");
using (var reader = new System.IO.StreamReader(response.GetResponseStream(), encoding))

Using this while reading the response stream yields some Cyrillic content, but unfortunately that isn't the correct output either: https://dotnetfiddle.net/x8jnN8. So I'm sorry, but this isn't a real answer to your problem :/
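One more thing that may be worth checking: the output in the question starts with \u001f, then an undecodable byte, then \b. That pattern matches the gzip header (0x1F 0x8B, followed by 0x08 for deflate), so the body might be gzip-compressed rather than wrongly decoded. The sketch below is only an illustration under that assumption; I have not verified it against the site. The class GzipResponseSketch and its methods are hypothetical names, and since _requestCreator is not shown in the question, CreateRequest is a guess at where HttpWebRequest.AutomaticDecompression could be set. Main builds a compressed body locally, so it runs without network access.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Net;
using System.Text;

static class GzipResponseSketch
{
    // True when the buffer starts with the gzip magic number 0x1F 0x8B.
    public static bool LooksLikeGzip(byte[] data)
    {
        return data != null && data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
    }

    // Decompresses a gzip buffer and decodes the result as UTF-8.
    public static string DecodeGzipUtf8(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // Hypothetical helper (not from the question): asks HttpWebRequest to
    // decompress gzip/deflate bodies transparently before they are read.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    static void Main()
    {
        // Simulate a gzip-compressed UTF-8 body locally (no network needed).
        byte[] compressed;
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                byte[] raw = Encoding.UTF8.GetBytes("<html>Привет</html>");
                gzip.Write(raw, 0, raw.Length);
            }
            // ToArray still works after the stream has been closed.
            compressed = buffer.ToArray();
        }

        Console.WriteLine(LooksLikeGzip(compressed));
        Console.WriteLine(DecodeGzipUtf8(compressed));
    }
}
```

If the assumption holds, setting AutomaticDecompression on the request should let the original StreamReader code read plain UTF-8 text. Note that WebRequest.Create is marked obsolete on newer .NET versions, where it still compiles but produces a warning.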

Upvotes: 1
