Reputation: 3032
I am trying to get HTML content from the amazon website. Here is my code to create request, response, and get string:
public static HttpWebResponse GetHttpWebResponse(string url)
{
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
webRequest.ContentType = "text/xml";
try
{
return (HttpWebResponse)webRequest.GetResponse();
}
catch (WebException e)
{
if (e.Response == null)
throw new Exception("Cannot get response");
return (HttpWebResponse)e.Response;
}
}
public static string GetString(HttpWebResponse response)
{
Encoding encoding = Encoding.UTF8;
using (var reader = new StreamReader(response.GetResponseStream(), encoding))
{
string responseText = reader.ReadToEnd();
return responseText;
}
}
It is working fine with other web sites. However, when I try to get content from amazon, for example: https://www.amazon.com/gp/product/B00AEISSHA/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1 I am seeing encoded content:
I tried to change Encoding
and used HttpUtility.HtmlDecode(html);
but it couldn't help. Is there any simple way to get content from Amazon?
Upvotes: 0
Views: 592
Reputation: 382
You're not catering for compression. If you update your webrequest like this, it should do the trick.
public static HttpWebResponse GetHttpWebResponse(string url)
{
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
webRequest.ContentType = "text/xml";
webRequest.AutomaticDecompression = DecompressionMethods.GZip;
try
{
return (HttpWebResponse)webRequest.GetResponse();
}
catch (WebException e)
{
if (e.Response == null)
throw new Exception("Cannot get response");
return (HttpWebResponse)e.Response;
}
}
Upvotes: 4