user8511791

Reputation: 45

How to use Scrapy with an auto-rotating proxy?

I have an auto-rotating proxy from Storm Proxies, but I don't know how to use it properly with Scrapy. The IP stays the same for every request I make. Storm Proxies support says that the current connection has to be closed for the IP to change.

I don't know how to close the connection or force a new connection for each request. Is there another way to do this?

Here is my current code:

import scrapy
import scraper_helper


class EbayfastSpider(scrapy.Spider):
    name = 'test'
    custom_settings = {
        'CONCURRENT_REQUESTS': 10,
        'CONCURRENT_REQUESTS_PER_IP': 1,
    }

    header_string = """
    user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36
    x-requested-with: XMLHttpRequest
    """
    headers = scraper_helper.get_dict(header_string)

    def start_requests(self):
        url = 'https://httpbin.org/ip'
        for index in range(10):
            yield scrapy.Request(
                url=url,
                callback=self.parse,
                headers=self.headers,
                dont_filter=True,
                meta={'proxy': 'rotating_proxy'},
            )
            print(index)

    def parse(self, response):
        print(response.text)

Upvotes: 1

Views: 825

Answers (1)

Pouya Esmaeili

Reputation: 1263

You can ask for the current connection to be closed by adding Connection: close to the request headers. The connection is then closed once the response has been sent, so the next request has to open a new one. This slows scraping down somewhat, because the TCP three-way handshake is repeated for every request.
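As a minimal sketch of that idea, the helper below returns a copy of a header dict with Connection: close set. The function name with_connection_close is made up for illustration and is not part of Scrapy or scraper_helper. It assumes the headers are a plain dict, as scraper_helper.get_dict produces in the question.

```python
def with_connection_close(headers):
    """Return a copy of `headers` that asks for the connection to be closed.

    Hypothetical helper for illustration; not part of Scrapy.
    """
    # Drop any existing Connection header, whatever its letter case,
    # so that a stray "connection: keep-alive" cannot conflict.
    merged = {k: v for k, v in headers.items() if k.lower() != "connection"}
    merged["Connection"] = "close"
    return merged


if __name__ == "__main__":
    base = {"user-agent": "Mozilla/5.0", "connection": "keep-alive"}
    print(with_connection_close(base))
```

In the spider from the question, the result would be passed to the request, for example headers=with_connection_close(self.headers) in the scrapy.Request call. Whether the proxy then assigns a new IP for each request depends on the provider, so it is worth checking the output of https://httpbin.org/ip over several requests.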

Upvotes: 2
