How to use Downloader Middleware in Scrapy

Question

I am using Scrapy to scrape some web pages. I wrote a custom ProxyMiddleware class and implemented my requirement in its process_request(self, request, spider) method. Here is my code (copied):

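import random

# Placeholder: the actual list of proxies was elided in the original post.
proxy_list = []

class ProxyMiddleware:
    # Scrapy downloader middlewares are plain classes; no base class is required.
    def __init__(self, proxy_ip=''):
        self.proxy_ip = proxy_ip

    def process_request(self, request, spider):
        # Pick a random proxy for this request, if any are configured.
        ip = random.choice(proxy_list) if proxy_list else None
        if ip:
            request.meta['proxy'] = ip
        # Return None so Scrapy keeps processing this request;
        # returning the request itself would reschedule it instead.
        return None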
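
The process_request logic can be checked without running a crawl. The following is a minimal sketch under stated assumptions: FakeRequest is a hypothetical stand-in that models only the meta dict of a real request, the proxy addresses are placeholders, and passing the proxy list to the constructor is an illustrative choice rather than something the original code does.

```python
import random


class FakeRequest:
    """Hypothetical stand-in for a Scrapy request; only `meta` is modelled."""

    def __init__(self, url):
        self.url = url
        self.meta = {}


class ProxyMiddleware:
    """Sets request.meta['proxy'] to a randomly chosen proxy, if any exist."""

    def __init__(self, proxy_list=None):
        # Copy the list so later changes by the caller do not affect us.
        self.proxy_list = list(proxy_list or [])

    def process_request(self, request, spider):
        if self.proxy_list:
            request.meta['proxy'] = random.choice(self.proxy_list)
        # None tells Scrapy to continue handling this request normally.
        return None


# Placeholder addresses, for illustration only.
PROXIES = ['http://10.0.0.1:8080', 'http://10.0.0.2:8080']

middleware = ProxyMiddleware(PROXIES)
request = FakeRequest('http://example.com')
result = middleware.process_request(request, spider=None)
```

After the call, request.meta['proxy'] holds one of the placeholder addresses and result is None.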

What I don't understand is how Scrapy will use my implementation instead of the default class. After some searching, my understanding is that I need to change settings.py:

DOWNLOADER_MIDDLEWARES = {
    'IPProxy.middlewares.MyCustomDownloaderMiddleware': 543,
    'IPProxy.IPProxy.spiders.RandomProxy': 600
}

To help readers understand my project structure, I added the second entry to the dict with an arbitrary order value. My project structure is:

My questions are:

1. How do I use DOWNLOADER_MIDDLEWARES in settings.py correctly?
2. How do I assign the values to the entries in DOWNLOADER_MIDDLEWARES?
3. How do I make Scrapy call my customized code instead of the default?
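
One way to picture how these settings fit together is the sketch below. It rests on several assumptions: the path 'IPProxy.middlewares.ProxyMiddleware' is hypothetical, since the project layout is not shown. BASE_EXCERPT is only an excerpt of Scrapy's built-in DOWNLOADER_MIDDLEWARES_BASE. And effective_order is a simplified illustration of how the two dicts are merged and sorted, not Scrapy's own code. Keys are import paths to middleware classes and values are order numbers. Scrapy calls process_request in increasing order, and a value of None disables that middleware.

```python
# Excerpt of Scrapy's built-in defaults (not the full dict).
BASE_EXCERPT = {
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': 500,
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 750,
}

# What could go in settings.py. The first path is an assumption about
# where the custom class lives; adjust it to the real module.
DOWNLOADER_MIDDLEWARES = {
    'IPProxy.middlewares.ProxyMiddleware': 543,
    # Optional: None disables the built-in proxy middleware.
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': None,
}


def effective_order(base, custom):
    """Simplified illustration: merge both dicts, drop entries set to
    None, and return the remaining paths sorted by order number."""
    merged = {**base, **custom}
    enabled = {path: order for path, order in merged.items()
               if order is not None}
    return sorted(enabled, key=enabled.get)


ORDER = effective_order(BASE_EXCERPT, DOWNLOADER_MIDDLEWARES)
```

In this sketch the custom middleware at 543 runs after the user-agent middleware at 500, and the built-in proxy middleware is dropped. Disabling the built-in one is shown only to illustrate the None convention. A custom middleware that sets request.meta['proxy'] can also run alongside it at a lower order number.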
