Lakshmipathi

Reputation: 121

Problem with Python requests while using proxies

I am trying to scrape a website using Python requests. The website can only be scraped through proxies, so I implemented the code for that. However, it is banning all my requests even when I am using proxies, so I used https://api.ipify.org/?format=json to check whether the proxies are working properly. I found that it shows my original IP even while using proxies. The code is below:

from concurrent.futures import ThreadPoolExecutor
import random
import sys

import requests

# Load the proxy list (one host:port per line) from the file given as the first argument
with open(sys.argv[1], "r", encoding="utf-8") as data:
    http = [line.strip() for line in data if line.strip()]

url = "https://api.ipify.org/?format=json"

def fetch(session, url):
    # Try up to 5 randomly chosen proxies and stop at the first successful request
    for i in range(5):
        proxy = {'http': 'http://' + random.choice(http)}
        try:
            with session.get(url, proxies=proxy, allow_redirects=False) as response:
                print("Proxy : ", proxy, " | Response : ", response.text)
                break
        except requests.RequestException:
            # This proxy failed; try another one
            pass

# @timer(1, 5)

if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=1) as executor:
        with requests.Session() as session:
            executor.map(fetch, [session] * 100, [url] * 100)
            executor.shutdown(wait=True)

I have tried a lot but don't understand why my own IP address is shown instead of the proxy's IPv4 address. The output of the code is here: https://i.sstatic.net/Mk03j.jpg

Upvotes: 0

Views: 1660

Answers (1)

Olvin Roght

Reputation: 7812

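The problem is that you have set a proxy only for http, but you are sending the request to a website that uses https. requests chooses the proxy by the scheme of the URL, so the https request bypasses your proxy and goes out from your own IP. The solution is simple: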

proxies = dict.fromkeys(('http', 'https', 'ftp'), 'http://' + random.choice(http))
# You can set proxy for session
session.proxies.update(proxies)
response = session.get(url)
# Or you can pass proxy as argument
response = session.get(url, proxies=proxies)
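
To illustrate why the original dictionary had no effect, here is a minimal sketch of scheme-based proxy lookup that needs no network access. The helpers pick_proxy and build_proxies are hypothetical names made up for this example, and pick_proxy is a simplified model, not the actual requests implementation. The address 203.0.113.10:8080 is a placeholder from the documentation IP range.

```python
from urllib.parse import urlparse


def pick_proxy(url, proxies):
    """Simplified model: return the proxy registered for the URL's scheme, or None.

    Hypothetical helper for illustration only, not the real requests code.
    """
    scheme = urlparse(url).scheme
    return proxies.get(scheme)


def build_proxies(address):
    """Route both http and https traffic through the same HTTP proxy."""
    proxy_url = "http://" + address
    return {"http": proxy_url, "https": proxy_url}


url = "https://api.ipify.org/?format=json"
address = "203.0.113.10:8080"  # placeholder proxy, not a real server

# Only the 'http' key is set, as in the question: nothing matches an https URL
only_http = {"http": "http://" + address}
print(pick_proxy(url, only_http))

# Both schemes are set: the https URL now resolves to the proxy
print(pick_proxy(url, build_proxies(address)))
```

The first call prints None, meaning the request would be sent directly, and the second prints the proxy URL. Note that the proxy itself also has to support HTTPS tunnelling (the CONNECT method), otherwise the request fails instead of falling back to your own IP.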

Upvotes: 1
