Reputation: 45
I have an auto-rotating proxy from Storm Proxies, but I don't know how to use it properly with Scrapy. The IP stays the same for every request I make. Storm Proxies support says the current connection has to be closed for the IP to change.
But I don't know how to close the connection or open a new one for each request. Is there any other way to do this?
Here is my current code:
import scrapy
import scraper_helper


class EbayfastSpider(scrapy.Spider):
    name = 'test'
    custom_settings = {
        'CONCURRENT_REQUESTS': 10,
        'CONCURRENT_REQUESTS_PER_IP': 1
    }
    header_string = """
    user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36
    x-requested-with: XMLHttpRequest
    """
    headers = scraper_helper.get_dict(header_string)

    def start_requests(self):
        url = 'https://httpbin.org/ip'
        for index in range(10):
            yield scrapy.Request(url=url, callback=self.parse, headers=self.headers, dont_filter=True,
                                 meta={'proxy': 'rotating_proxy'}
                                 )
            print(index)

    def parse(self, response):
        print(response.text)
Upvotes: 1
Views: 825
Reputation: 1263
You can signal that the current connection should be closed by adding a Connection: close header to the request. The connection is then closed once the response has been sent. This also adds some delay to the scrape, because every request has to repeat the TCP three-way handshake.
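As a minimal sketch of that suggestion, the headers from the question could be extended like this. The helper name with_connection_close is hypothetical and only for illustration, and the shortened user-agent value is a placeholder. Whether the proxy actually hands out a new IP afterwards depends on the provider, so treat this as something to try rather than a guaranteed fix.

```python
def with_connection_close(headers):
    """Return a copy of `headers` with 'Connection: close' set.

    Hypothetical helper, not part of Scrapy. Any existing Connection
    header is dropped, whatever its capitalisation, so only one is sent.
    """
    merged = {k: v for k, v in headers.items() if k.lower() != "connection"}
    merged["Connection"] = "close"
    return merged


# Placeholder headers standing in for the dict built in the question.
base_headers = {
    "user-agent": "Mozilla/5.0",
    "x-requested-with": "XMLHttpRequest",
}
request_headers = with_connection_close(base_headers)

# In the spider from the question this dict would replace self.headers:
#   yield scrapy.Request(url=url, callback=self.parse,
#                        headers=request_headers, dont_filter=True,
#                        meta={'proxy': 'rotating_proxy'})
```

Running the spider against https://httpbin.org/ip, as in the question, is a quick way to check whether the reported IP now changes between requests.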
Upvotes: 2