Reputation: 18745
I'm a little confused about the requests module, especially proxies.
From the documentation:
PROXIES
Dictionary mapping protocol to the URL of the proxy (e.g. {'http': 'foo.bar:3128'}) to be used on each Request.
Can there be more than one proxy of the same type in the dictionary? I mean, is it possible to put a list of proxies there, so that the requests module tries them and uses only those that work?
Or can there be only one proxy address per protocol, for example for http?
Upvotes: 0
Views: 5024
Reputation: 19378
Well, actually you can. I've done this with a few lines of code and it works pretty well.
import requests
from random import choice


class Client:
    def __init__(self):
        self._session = requests.Session()
        self.proxies = None

    def set_proxy_pool(self, proxies, auth=None, https=True):
        """Randomly choose a proxy for every GET/POST request

        :param proxies: list of proxies, like ["ip1:port1", "ip2:port2"]
        :param auth: if proxy needs auth
        :param https: default is True, pass False if you don't need https proxy
        """
        # Undo an earlier pool first, otherwise the wrappers would wrap themselves
        if hasattr(self._session, 'original_get'):
            self.remove_proxy_pool()

        if https:
            self.proxies = [{'http': p, 'https': p} for p in proxies]
        else:
            self.proxies = [{'http': p} for p in proxies]

        # Note: requests applies `auth` to the target URL; proxy credentials
        # normally go in the proxy URL itself (user:pass@host:port).
        def get_with_random_proxy(url, **kwargs):
            proxy = choice(self.proxies)
            kwargs['proxies'] = proxy
            if auth:
                kwargs['auth'] = auth
            return self._session.original_get(url, **kwargs)

        def post_with_random_proxy(url, *args, **kwargs):
            proxy = choice(self.proxies)
            kwargs['proxies'] = proxy
            if auth:
                kwargs['auth'] = auth
            return self._session.original_post(url, *args, **kwargs)

        # Keep the original methods so remove_proxy_pool() can restore them
        self._session.original_get = self._session.get
        self._session.get = get_with_random_proxy
        self._session.original_post = self._session.post
        self._session.post = post_with_random_proxy

    def remove_proxy_pool(self):
        self.proxies = None
        self._session.get = self._session.original_get
        self._session.post = self._session.original_post
        del self._session.original_get
        del self._session.original_post

    # You can define whatever operations using self._session
I use it like this:
client = Client()
client.set_proxy_pool(['112.25.41.136', '180.97.29.57'])
It's simple, but actually works for me.
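The patching of the session is optional: requests accepts a proxies argument on every call, so the pool can also live outside the session. A minimal sketch of that idea, assuming two hypothetical helpers, build_proxy_pool and pick_proxy (neither is part of requests):

```python
import random


def build_proxy_pool(addresses, https=True):
    """Turn ["ip1:port1", ...] into a list of requests-style proxies dicts."""
    schemes = ("http", "https") if https else ("http",)
    return [{scheme: address for scheme in schemes} for address in addresses]


def pick_proxy(pool, rng=random):
    """Return one proxies dict chosen at random from the pool."""
    return rng.choice(pool)


# Same addresses as in the usage example above.
pool = build_proxy_pool(['112.25.41.136', '180.97.29.57'])
proxies = pick_proxy(pool)
# Each call can then pass its own choice, e.g.:
#   requests.get(url, proxies=pick_proxy(pool))
```

This keeps the session untouched, at the cost of having to pass proxies explicitly on each request.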
Upvotes: 1
Reputation: 11596
Using the proxies parameter is limited by the very nature of a Python dictionary (i.e. each key must be unique).
import requests

url = 'http://google.com'
proxies = {'https': '84.22.41.1:3128',
           'http': '185.26.183.14:80',
           'http': '178.33.230.114:3128'}  # duplicate key: this value wins

if __name__ == '__main__':
    print(url)
    print(proxies)
    response = requests.get(url, proxies=proxies)
    if response.status_code == 200:
        print(response.text)
    else:
        print('Response ERROR', response.status_code)
outputs
http://google.com
{'https': '84.22.41.1:3128', 'http': '178.33.230.114:3128'}
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content="Search the world's information, including webpages, images, videos and more. Google has many special features to help you find exactly what you're looking for."
...more html...
As you can see, the value of the http protocol key in the proxies dictionary corresponds to the last one encountered in its assignment (i.e. 178.33.230.114:3128). Try swapping the http entries around.
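The collapse happens when Python builds the dictionary literal, before requests ever sees it, so it can be checked without any network access:

```python
proxies = {
    "https": "84.22.41.1:3128",
    "http": "185.26.183.14:80",
    "http": "178.33.230.114:3128",  # same key again: replaces the line above
}

print(len(proxies))     # 2
print(proxies["http"])  # 178.33.230.114:3128
```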
So, the answer is no, you cannot specify multiple proxies for the same protocol using a simple dictionary.
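What the question asks for (try several proxies and use the first one that works) can still be layered on top of the one-proxy-per-protocol dictionary. A minimal sketch, not a requests feature: get_with_failover and its fetch parameter are hypothetical names, the proxies are tried in the order given, and any OSError is treated as "this proxy failed" (the requests exception classes derive from it).

```python
def get_with_failover(url, proxy_urls, fetch=None, **kwargs):
    """Try each proxy in order and return the first successful response."""
    if fetch is None:
        import requests  # imported lazily so the helper can be exercised offline
        fetch = requests.get

    last_error = None
    for proxy_url in proxy_urls:
        # requests expects exactly one proxy URL per protocol key.
        proxies = {"http": proxy_url, "https": proxy_url}
        try:
            return fetch(url, proxies=proxies, **kwargs)
        except OSError as exc:  # requests.RequestException subclasses OSError
            last_error = exc
    raise RuntimeError("no working proxy for %s" % url) from last_error


# Possible usage (addresses taken from the example above):
# response = get_with_failover(
#     "http://google.com",
#     ["http://185.26.183.14:80", "http://178.33.230.114:3128"],
#     timeout=5,
# )
```

Passing a timeout is worth considering here, since a dead proxy may otherwise keep the request hanging instead of failing over.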
I have tried using an iterable as a value, which would make sense to me:
proxies = {'https': '84.22.41.1:3128',
           'http': ('178.33.230.114:3128', '185.26.183.14:80')}
but with no luck; it produces an error.
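The documented format is one proxy URL string per protocol, so an iterable value has to be unpacked before the dictionary reaches requests. One option, sketched here with a hypothetical helper named expand_proxies, is to expand such a spec into ordinary one-proxy-per-protocol dictionaries and try them one at a time:

```python
from itertools import product


def expand_proxies(spec):
    """Yield every plain proxies dict described by a spec whose values
    may be a single string or an iterable of strings."""
    keys = list(spec)
    options = [[v] if isinstance(v, str) else list(v) for v in spec.values()]
    for combo in product(*options):
        yield dict(zip(keys, combo))


spec = {'https': '84.22.41.1:3128',
        'http': ('178.33.230.114:3128', '185.26.183.14:80')}
candidates = list(expand_proxies(spec))
# candidates[0] == {'https': '84.22.41.1:3128', 'http': '178.33.230.114:3128'}
# candidates[1] == {'https': '84.22.41.1:3128', 'http': '185.26.183.14:80'}
```

Each element of candidates is a valid value for the proxies parameter.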
Upvotes: 3