Reputation: 81
I want to get the HTML from this URL: https://store.steampowered.com/app/513710/SCUM/
This should be easy, but I couldn't do it due to an SSL/TLS error.
So I used code from this question: Requesting html over https with c# Webclient
I can now get a StreamReader, but when I call ReadToEnd() to read it into a string, the result is corrupt, something like this: "�"
This looks like a character encoding problem, but if you open https://store.steampowered.com/app/513710/SCUM/
and view the page source in your browser, you can see this near the top:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
And the code I'm using sets this header:
webClient.Headers["Accept-Charset"] = "ISO-8859-1,utf-8;q=0.7,*;q=0.7";
utf-8 is listed there, so I have no idea why I'm having this problem. I tried replacing:
StreamReader(webClient.OpenRead(steamURL));
With:
StreamReader(webClient.OpenRead(steamURL), Encoding.UTF8, true);
But I still didn't get the proper text. I've added all the information I could; if you need anything else, I'll edit the question.
Thank you for your time and have a nice day.
Regards,
David
PS: This is my code right now:
private StreamReader getStreamReader(string steamURL, WebClient webClient)
{
    return new StreamReader(webClient.OpenRead(steamURL), Encoding.UTF8, true);
}

private void getSteamCosts()
{
    // When I try to access a Steam HTML page, an SSL error appears.
    // We need a specific security protocol.
    // I try all of them, just in case.
    ServicePointManager.ServerCertificateValidationCallback =
        new RemoteCertificateValidationCallback(
            delegate
            {
                return true;
            });
    using (WebClient webClient = new WebClient())
    {
        webClient.Headers["User-Agent"] = "Mozilla/5.0 (Windows;"
            + " U; Windows NT 6.0; en-US; rv:1.9.2.6) Gecko/20100625"
            + " Firefox/3.6.6 (.NET CLR 3.5.30729)";
        webClient.Headers["Accept"] = "text/html,application/xhtml+"
            + "xml,application/xml;q=0.9,*/*;q=0.8";
        webClient.Headers["Accept-Language"] = "en-us,en;q=0.5";
        webClient.Headers["Accept-Encoding"] = "gzip,deflate";
        webClient.Headers["Accept-Charset"] = "ISO-8859-1,utf-8;q=0.7,*;q=0.7";
        StreamReader sr = null;
        string steamURL = "https://store.steampowered.com/app/513710/SCUM/";
        try
        {
            // This one should work
            ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
            sr = getStreamReader(steamURL, webClient);
            lbFinalSteam.Text = "TLS12Final";
        }
        catch (Exception) // Bad coding practice, just wanted it to work
        {
            // If that's not the case, I try the rest
            try
            {
                ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls;
                sr = getStreamReader(steamURL, webClient);
                lbFinalSteam.Text = "TLSFinal";
            }
            catch (Exception)
            {
                try
                {
                    ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3;
                    sr = getStreamReader(steamURL, webClient);
                    lbFinalSteam.Text = "SSL3Final";
                }
                catch (Exception)
                {
                    try
                    {
                        ServicePointManager.SecurityProtocol =
                            SecurityProtocolType.Tls11;
                        sr = getStreamReader(steamURL, webClient);
                        lbFinalSteam.Text = "TLS11Final";
                    }
                    catch (Exception)
                    {
                        lbFinalSteam.Text = "NoFinal";
                    }
                }
            }
        }
        if (sr != null)
        {
            string allLines = sr.ReadToEnd();
        }
    }
}
Edit: Maybe the problem is how I read the StreamReader into a string? I mean this line:
string allLines = sr.ReadToEnd();
Should I use anything else?
Upvotes: 1
Views: 427
Reputation: 81
As Alex K (https://stackoverflow.com/users/246342/alex-k) already wrote, the problem wasn't the character encoding: the response I was getting was gzip-compressed, and I was reading the compressed bytes as text. I just removed this line:
webClient.Headers["Accept-Encoding"] = "gzip,deflate";
And it works! Thank you Alex K! :D
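If you would rather keep compression than drop the header, one possible alternative is to let the framework decompress the response. The sketch below is only an illustration: DecompressingWebClient and GzipDemo are hypothetical names that are not part of my original code. The first class overrides WebClient.GetWebRequest and sets HttpWebRequest.AutomaticDecompression. As far as I know, that also makes the framework send Accept-Encoding by itself, so I am assuming the manual Accept-Encoding line should still be removed. The second class needs no network access and just shows why the string looked corrupt: gzip bytes are not valid UTF-8 text until they go through a GZipStream.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Net;
using System.Text;

// Hypothetical helper: a WebClient whose requests are decompressed
// transparently by the framework.
public class DecompressingWebClient : WebClient
{
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        HttpWebRequest httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            // Ask the framework to undo gzip/deflate before we read the stream.
            httpRequest.AutomaticDecompression =
                DecompressionMethods.GZip | DecompressionMethods.Deflate;
        }
        return request;
    }
}

// Hypothetical offline demo of the underlying problem.
public static class GzipDemo
{
    // Compresses UTF-8 text the way a server might when it honors
    // "Accept-Encoding: gzip".
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (MemoryStream output = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // ToArray still works after the GZipStream has closed the MemoryStream.
            return output.ToArray();
        }
    }

    // Decompresses first, then decodes as UTF-8: the step my code was missing.
    public static string Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (StreamReader reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With this, the using statement in getSteamCosts() would create a DecompressingWebClient instead of a plain WebClient, and getStreamReader() could stay as it is. I have not tested this against the Steam store page itself, so treat it as a starting point.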
Upvotes: 1