Reputation: 439
I have simple code for getting a response from websites, but there is one small problem. I am trying to get responses from Russian websites: from one website I get unknown symbols, and from the other I get normal text. Where might the problem be?
Response from: www.kinopoisk.ru
������ � ����...
Response from: www.yandex.ru
Греция - Чехия. 1:2...
HttpWebRequest http = (HttpWebRequest) HttpWebRequest.Create("http://");
http.Timeout = 30000;
http.KeepAlive = true;
http.ContentType = "application/x-www-form-urlencoded";
http.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0";
http.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
http.Proxy = null;
WebResponse response = http.GetResponse();
Stream istream = response.GetResponseStream();
StreamReader reader = new StreamReader(istream);
Response.Write(reader.ReadToEnd());
reader.Close();
Upvotes: 2
Views: 8150
Reputation: 3177
This is a charset issue. To find out the character set of the response, use the CharacterSet property of the HttpWebResponse class. It returns a string.
HttpWebRequest myWebRequest = (HttpWebRequest)WebRequest.Create(url);
myWebRequest.Method = "GET";
HttpWebResponse myWebResponse = (HttpWebResponse)myWebRequest.GetResponse();
string str = myWebResponse.CharacterSet; // charset taken from the Content-Type header
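As a rough sketch of how the CharacterSet value could be put to use, the snippet below turns a charset name into an Encoding and hands it to the StreamReader. ResolveEncoding and ReadBody are hypothetical helper names, not framework methods. The UTF-8 fallback is an assumption, and so is the RegisterProvider call: it is needed on .NET Core / .NET 5+, where code pages such as windows-1251 are not available by default, and it is not required on the classic .NET Framework.

```csharp
using System;
using System.IO;
using System.Text;

static class CharsetAwareReader
{
    static CharsetAwareReader()
    {
        // Assumption: modern .NET. Makes legacy code pages (e.g. windows-1251) resolvable.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Hypothetical helper: maps a charset name to an Encoding,
    // returning the fallback when the name is missing or unknown.
    public static Encoding ResolveEncoding(string charset, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(charset))
            return fallback;
        try
        {
            // Some servers quote the charset value, so strip quotes first.
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (ArgumentException)
        {
            return fallback;
        }
    }

    // Hypothetical helper: reads the whole stream using the resolved encoding.
    public static string ReadBody(Stream body, string charset)
    {
        Encoding encoding = ResolveEncoding(charset, Encoding.UTF8);
        using (var reader = new StreamReader(body, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // The word "Привет" encoded as windows-1251 bytes.
        byte[] bytes = { 0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2 };
        string text = ReadBody(new MemoryStream(bytes), "windows-1251");
        Console.WriteLine(text == "\u041F\u0440\u0438\u0432\u0435\u0442");
    }
}
```

With a real response, this would be called as ReadBody(myWebResponse.GetResponseStream(), myWebResponse.CharacterSet).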
If you want to know which content encoding (for example, gzip) was applied to the response, use the ContentEncoding property of the HttpWebResponse class. This property also returns a string.
HttpWebRequest myWebRequest = (HttpWebRequest)WebRequest.Create(url);
myWebRequest.Method = "GET";
HttpWebResponse myWebResponse = (HttpWebResponse)myWebRequest.GetResponse();
string str = myWebResponse.ContentEncoding; // value of the Content-Encoding header
Upvotes: 1
Reputation: 888293
kinopoisk.ru is encoded as WINDOWS-1251 (you can see this in the Content-Type header).
You need to pass Encoding.GetEncoding(1251) to the StreamReader to decode that.
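A minimal sketch of that fix, using an in-memory stream in place of the real response so it runs without network access. The names Cp1251Demo and ReadAsWindows1251 are made up for illustration. The RegisterProvider line is an assumption for .NET Core / .NET 5+, where code page 1251 has to be registered first; on the classic .NET Framework it can be dropped.

```csharp
using System;
using System.IO;
using System.Text;

static class Cp1251Demo
{
    // Hypothetical helper: decodes the stream as windows-1251 instead of
    // relying on StreamReader's default (UTF-8).
    public static string ReadAsWindows1251(Stream stream)
    {
        // Assumption: modern .NET, where legacy code pages are opt-in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding cp1251 = Encoding.GetEncoding(1251);
        using (var reader = new StreamReader(stream, cp1251))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // The word "Привет" as a windows-1251 server would send it.
        byte[] bytes = { 0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2 };

        // Default StreamReader: the bytes are not valid UTF-8, so they
        // decode to U+FFFD replacement characters.
        string withDefault;
        using (var reader = new StreamReader(new MemoryStream(bytes)))
        {
            withDefault = reader.ReadToEnd();
        }

        string with1251 = ReadAsWindows1251(new MemoryStream(bytes));

        Console.WriteLine(withDefault.Contains("\uFFFD"));
        Console.WriteLine(with1251 == "\u041F\u0440\u0438\u0432\u0435\u0442");
    }
}
```

In the question's code, the equivalent change would be new StreamReader(istream, Encoding.GetEncoding(1251)).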
Upvotes: 7