carck3r

Reputation: 317

HttpWebResponse - Encoding problem

I have a problem with encoding. When I download a site's source code, I get garbled characters instead of readable text.

I set the encoding to UTF-8 like this:

StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8);
string sourceCode = reader.ReadToEnd();

Thanks for your help!

Upvotes: 4

Views: 4544

Answers (4)

Hassan Faghihi

Reputation: 2021

I had the same issue. I tried changing the encoding, from the source to the result, and got nowhere. In the end, I came across a thread that led me to the solution: .NET: Is it possible to get HttpWebRequest to automatically decompress gzip'd responses?

You need to set the following on the request before retrieving the response:

rqst.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;

Once the request sends Accept-Encoding with 'gzip' or 'deflate', the server may compress the response body, and the compressed bytes are unreadable as text. They have to be decompressed before they are decoded.
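
To see why changing the text encoding alone cannot help, here is a minimal sketch that works without any network access. The class and method names (GzipDemo, Compress, Decompress) are made up for illustration; Compress only stands in for what a server does when it honors Accept-Encoding: gzip.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipDemo
{
    // Hypothetical stand-in for a server that gzips its response body.
    public static byte[] Compress(string text)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                byte[] raw = Encoding.UTF8.GetBytes(text);
                gzip.Write(raw, 0, raw.Length);
            }
            // ToArray still works after the GZipStream has closed the buffer.
            return buffer.ToArray();
        }
    }

    // Decompress first, then decode the bytes as UTF-8 text.
    public static string Decompress(byte[] compressed)
    {
        using (var gzip = new GZipStream(new MemoryStream(compressed), CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        byte[] body = Compress("<html>hello</html>");

        // Decoding the compressed bytes directly does not give the text back.
        Console.WriteLine(Encoding.UTF8.GetString(body) == "<html>hello</html>");

        // Decompressing before decoding does.
        Console.WriteLine(Decompress(body));
    }
}
```

Setting AutomaticDecompression on the request should make HttpWebRequest perform the decompression step for you, so the StreamReader only ever sees the original bytes.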

Upvotes: 5

mtmk

Reputation: 6316

Try to use the encoding specified:

Encoding encoding;
try
{
    encoding = Encoding.GetEncoding(response.CharacterSet);
}
catch (ArgumentException)
{
    // Cannot determine encoding, use default
    encoding = Encoding.UTF8;
}

StreamReader reader = new StreamReader(response.GetResponseStream(), encoding);
string sourceCode = reader.ReadToEnd();

If you are accepting gzip somehow, this may help (I haven't tried it myself, and admittedly it doesn't make much sense, since your encoding is not gzip):

request.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip,deflate");
request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;

Upvotes: 6

sizar surani

Reputation: 69

Change this line in your code:

using (StreamReader streamReader = new StreamReader(stream, Encoding.GetEncoding(1251)))

It may help you.

Upvotes: 0

Jim Mischel

Reputation: 133950

But the response might not be UTF-8. Have you checked the CharacterSet and the ContentType properties of the response object to make sure you're using the right encoding?

In any event, those two characters look like the code page 437 characters for values 03 and 08. It looks like there's some binary data in your data stream.

I would suggest that for debugging, you use Stream.Read to read the first few bytes from the response into a byte array and then examine the values to see what you're getting.
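
A rough sketch of that debugging step is below. The helper names (ResponsePeek, ReadPrefix, LooksLikeGzip) are hypothetical, and a MemoryStream stands in for response.GetResponseStream() so the example runs offline. The gzip check is an addition of mine: gzip data starts with the bytes 1F 8B, so seeing them at the start of the body would suggest the response is compressed rather than mis-encoded.

```csharp
using System;
using System.IO;

public static class ResponsePeek
{
    // Reads up to `count` bytes from the start of a stream.
    // Stream.Read may return fewer bytes than requested, so loop until done.
    public static byte[] ReadPrefix(Stream stream, int count)
    {
        var buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0) break; // end of stream
            total += read;
        }
        Array.Resize(ref buffer, total);
        return buffer;
    }

    // Gzip streams begin with the two magic bytes 0x1F 0x8B.
    public static bool LooksLikeGzip(byte[] prefix)
    {
        return prefix.Length >= 2 && prefix[0] == 0x1F && prefix[1] == 0x8B;
    }

    public static void Main()
    {
        // Stand-in for response.GetResponseStream(); the bytes are sample data.
        var stream = new MemoryStream(new byte[] { 0x1F, 0x8B, 0x08, 0x00, 0x00 });

        byte[] prefix = ReadPrefix(stream, 4);
        Console.WriteLine(BitConverter.ToString(prefix)); // prints "1F-8B-08-00"
        Console.WriteLine(LooksLikeGzip(prefix));         // prints "True"
    }
}
```

Note that reading the prefix consumes those bytes from the response stream, so this is meant for a one-off diagnostic run, not for code that still needs to read the full body afterwards.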

Upvotes: 2
