Reputation: 952
EDIT - FIXED. tl;dr: a somewhat old version of Python, installed a couple of years ago, had an ssl package that was not updated to handle newer SSL certificates. After updating Python and making sure the ssl package was up to date, everything worked.
I'm new to web scraping and wanted to scrape a certain site, but for some reason I'm getting errors when using Python's Requests package on this particular site.
I am working on secure login to scrape data from my user profile. The login address can be found here: https://secure.funorb.com/m=weblogin/loginform.ws?mod=hiscore_fo&ssl=0&expired=0&dest=
I'm just trying to perform simple tasks at this point, like printing the text from a get request. The following is my code.
import requests
req = requests.get('https://secure.funorb.com/m=weblogin/loginform.ws?mod=hiscore_fo&ssl=0&expired=0&dest=',verify=False)
print req.text
When I run this, an error is thrown:
File "/Library/Python/2.7/site-packages/requests/adapters.py", line 512, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:590)
I've looked in this file to see what's going on. It seems the culprit is
    except (_SSLError, _HTTPError) as e:
        if isinstance(e, _SSLError):
            raise SSLError(e, request=request)
        elif isinstance(e, ReadTimeoutError):
            raise ReadTimeout(e, request=request)
        else:
            raise
Unfortunately, I'm not really sure how to avoid this; I'm kind of at my debugging limit here.
My code works just fine on other secure sites, such as https://bitbucket.org/account/signin/. I've looked at a ton of solutions on Stack Exchange and around the net, and a lot of people claimed that adding the optional argument "verify=False" should fix these types of SSL errors (albeit it's not the most secure way to do it). But as you can see from my code snippet, this isn't helping me.
If anyone can get this working or give advice on where to go next, it would be much appreciated.
Upvotes: 4
Views: 1412
Reputation: 123561
... lot of people claimed adding in the optional argument "verify=False" should fix these types of SSL errors
Adding verify=False helps against errors when validating the certificate, but not against an EOF from the server, handshake errors, or similar.
As can be seen from SSLLabs, this specific server simply closes the connection (i.e. "EOF occurred in violation of protocol") for clients which don't support TLS 1.2 with modern ciphers. While you don't specify which SSL version you use, I expect it to be older than OpenSSL 1.0.1, the first version of OpenSSL supporting TLS 1.2.
Please check ssl.OPENSSL_VERSION for the version used in your code. If I'm correct, your only fix is to upgrade the version of OpenSSL used by Python. How this is done depends on your platform, but there are existing posts about it, like "Updating openssl in python 2.7".
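The version check suggested above can be sketched roughly as follows. This is only a minimal illustration: the helper name supports_tls12 is made up for this example, and it assumes that comparing ssl.OPENSSL_VERSION_INFO against (1, 0, 1) is a good enough proxy for TLS 1.2 support. It says nothing about which ciphers are available, so a passing check is necessary but may not be sufficient for this server.

```python
import ssl


def supports_tls12(version_info=ssl.OPENSSL_VERSION_INFO):
    """Return True if the given OpenSSL version tuple is at least 1.0.1.

    Hypothetical helper: OpenSSL 1.0.1 is the first release with TLS 1.2,
    so anything older cannot complete a TLS 1.2 handshake.
    """
    return tuple(version_info[:3]) >= (1, 0, 1)


# Human-readable version string of the OpenSSL that Python is linked against.
print(ssl.OPENSSL_VERSION)

# Whether that OpenSSL is new enough to speak TLS 1.2 at all.
print(supports_tls12())
```

If the second line prints False, the interpreter is linked against an OpenSSL that is too old for this server, and no requests option will work around that.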
Upvotes: 4
Reputation: 7589
I've seen this somewhere else. What if you try using a session, like this:
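import requests
sess = requests.Session()
# Retry failed connections up to 20 times
adapter = requests.adapters.HTTPAdapter(max_retries=20)
sess.mount('http://', adapter)
# The target URL is https, so the adapter has to be mounted for that prefix too
sess.mount('https://', adapter)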
Then replace requests.get() with sess.get().
If you want to keep working with requests, you may also need to install the ndg-httpsclient package.
Upvotes: 1