Reputation: 375
I have a lot of free proxies in a txt file, and I want to use them to crawl a website. When I use one of them, like 127.0.0.1 below, how can I tell whether the proxy is still usable?
proxy = urllib2.ProxyHandler({'http': '127.0.0.1'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
urllib2.urlopen('http://www.google.com')
Upvotes: 0
Views: 1015
Reputation: 534
Use this function:
import urllib2

def is_OK(ip):
    # Return True if a test page can be fetched through the proxy at `ip`.
    print 'Trying %s ...' % ip
    try:
        proxy_handler = urllib2.ProxyHandler({'http': ip})
        opener = urllib2.build_opener(proxy_handler)
        opener.addheaders = [('User-agent', 'Mozilla/5.0')]
        urllib2.install_opener(opener)
        req = urllib2.Request('http://www.icanhazip.com')
        # A timeout keeps a dead proxy from hanging the check.
        urllib2.urlopen(req, timeout=10)
        print '%s is OK' % ip
        return True
    except urllib2.HTTPError:
        print '%s is not OK' % ip
    except Exception:
        print '%s is not OK' % ip
    return False
From this answer: Python, checking if a proxy is alive?
So you'd just iterate over the file (assuming 1 IP address per line) and check if is_OK() returns True:
with open('ip_addresses.txt') as fp:
    for ip in fp:
        ip = ip.strip()  # drop the trailing newline read from the file
        if ip and is_OK(ip):
            do_something()
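If you are on Python 3, where urllib2 has become urllib.request, the same idea could be sketched roughly as below. This is only a sketch under a few assumptions: the names read_proxies, proxy_works and working_proxies are made up for illustration, the 5 second timeout is arbitrary, and the test URL is just the one used in the function above. It builds a separate opener per proxy instead of calling install_opener, so the global opener is left alone.

```python
import http.client
import urllib.request


def read_proxies(lines):
    """Return stripped, non-empty proxy addresses from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def proxy_works(proxy, test_url="http://www.icanhazip.com", timeout=5.0):
    """Return True if test_url can be fetched through the given HTTP proxy.

    The URL and timeout defaults are assumptions; adjust them as needed.
    """
    handler = urllib.request.ProxyHandler({"http": proxy})
    opener = urllib.request.build_opener(handler)  # local opener, not installed globally
    opener.addheaders = [("User-agent", "Mozilla/5.0")]
    try:
        with opener.open(test_url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def working_proxies(lines, check=proxy_works):
    """Keep only the proxies for which check(proxy) is true."""
    return [proxy for proxy in read_proxies(lines) if check(proxy)]
```

You would then call working_proxies(fp) inside the with open('ip_addresses.txt') as fp: block to get the list of proxies that responded. The check parameter is only there so the filtering can be tried with a stand-in function, without touching the network.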
Upvotes: 0