colla

Reputation: 837

Proxy Error: BeautifulSoup HTTPSConnectionPool

My code is supposed to call https://httpbin.org/ip to get my origin IP through a random proxy chosen from a list I scraped from a website that publishes free proxies. However, when I run the code below, it sometimes returns a correct response (status 200 with the expected body) and sometimes fails with:

MaxRetryError: HTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /ip (Caused by ProxyError('Cannot connect to proxy.',  NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x000001EF83500DC8>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond')))

Traceback (most recent call last):

  File "<ipython-input-196-baf92a94e8ec>", line 19, in <module>
    response = s.get(url,proxies=proxyDict)

This is the code I am using:

import random
import requests
from bs4 import BeautifulSoup

res = requests.get('https://free-proxy-list.net/', headers={'User-Agent':'Mozilla/5.0'})
soup = BeautifulSoup(res.text,"lxml")
proxies = []
for items in soup.select("#proxylisttable tbody tr"):
    proxy_list = ':'.join([item.text for item in items.select("td")[:2]])
    proxies.append(proxy_list)

url = 'https://httpbin.org/ip'
choosenProxy = random.choice(proxies)
proxyDict = {
    'http' : 'http://'+str(choosenProxy),
    'https' : 'https://'+str(choosenProxy)
    }
s = requests.Session()
response = s.get(url,proxies=proxyDict)
print(response.text)

What does the error mean? Is there a way I could fix this?

Upvotes: 0

Views: 1061

Answers (1)

SIM

Reputation: 22440

The error means the randomly chosen proxy did not respond in time, which happens often with free proxies. Try the following solution. It keeps trying different proxies until it finds a working one. Once it does, the script prints the required response and breaks out of the loop.

import random
import requests
from bs4 import BeautifulSoup

url = 'https://httpbin.org/ip'

proxies = []

res = requests.get('https://free-proxy-list.net/', headers={'User-Agent':'Mozilla/5.0'})
soup = BeautifulSoup(res.text,"lxml")
for items in soup.select("#proxylisttable tbody tr"):
    proxy_list = ':'.join([item.text for item in items.select("td")[:2]])
    proxies.append(proxy_list)

while True:
    # Pick a fresh random proxy on every attempt
    choosenProxy = random.choice(proxies)
    proxyDict = {
        'http' : f'http://{choosenProxy}',
        'https' : f'https://{choosenProxy}'
    }
    print("trying with:",proxyDict)
    try:
        # timeout=5 stops a dead proxy from hanging the request
        response = requests.get(url,proxies=proxyDict,timeout=5)
        print(response.text)
        break
    except Exception:
        # This proxy failed; try another one
        continue
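
If none of the scraped proxies work, a `while True` retry loop never ends. As a rough sketch (not part of the original answer), the same retry idea can be wrapped in a function that gives up after a fixed number of attempts. The name `fetch_via_random_proxy` and the parameters `get` and `max_attempts` are made up for illustration; `get` only exists so the function can be exercised with a stand-in instead of a real network call.

```python
import random


def fetch_via_random_proxy(url, proxies, get=None, max_attempts=10, timeout=5):
    """Try up to max_attempts distinct random proxies.

    Returns the first successful response, or None if every attempt fails.
    `get` is a hypothetical hook: any callable with the same keyword
    arguments as requests.get (proxies=..., timeout=...).
    """
    if get is None:
        # Imported lazily so the helper can be tried with a stub `get`
        import requests
        get = requests.get

    # Sample without replacement so a dead proxy is not retried
    candidates = random.sample(proxies, min(max_attempts, len(proxies)))
    for proxy in candidates:
        proxy_dict = {
            "http": f"http://{proxy}",
            "https": f"https://{proxy}",
        }
        try:
            return get(url, proxies=proxy_dict, timeout=timeout)
        except Exception:
            # This proxy failed or timed out; move on to the next one
            continue
    return None
```

With the `proxies` list built as above, `response = fetch_via_random_proxy(url, proxies)` followed by a check for `None` would replace the loop. Sampling without replacement is a design choice here: it avoids wasting attempts on a proxy that has already failed.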

Upvotes: 1
