Liondancer

Reputation: 16469

Scrapy with proxies using downloader middleware

I'm new to Scrapy and I am trying to build my own Downloader Middleware in order to go through a proxy to scrape the web. I am getting this error:

Traceback (most recent call last):
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/twisted/internet/defer.py", line 1128, in _inlineCallbacks
    result = g.send(result)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/crawler.py", line 90, in crawl
    six.reraise(*exc_info)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/crawler.py", line 72, in crawl
    self.engine = self._create_engine()
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/crawler.py", line 97, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/core/engine.py", line 68, in __init__
    self.downloader = downloader_cls(crawler)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/core/downloader/__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/utils/misc.py", line 44, in load_object
    mod = import_module(module)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named downloaders.downloader_middlewares.proxy_connect

This error is due to Scrapy not being able to find my middleware. I'm not sure whether it's caused by me not setting up the correct path or by something wrong in the middleware itself.

This is my project structure:

/chisel
    __init__.py
    pipelines.py
    items.py
    settings.py
    /downloaders
        __init__.py
        /downloader_middlewares
            __init__.py
            proxy_connect.py
        /resources
            config.json
    /spiders
        __init__.py
        craiglist_spider.py
        /spider_middlewares
            __init__.py
        /resources
            craigslist.json
scrapy.cfg

Within my settings.py I have:

DOWNLOADER_MIDDLEWARES = {
    'downloaders.downloader_middlewares.proxy_connect.ProxyConnect': 100,
    'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110
}
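
For context, the middleware code itself isn't shown in the question; a minimal ProxyConnect along these lines (the class body below is an assumed sketch, and PROXY_URL is a hypothetical custom setting) would simply set request.meta['proxy'] so that HttpProxyMiddleware can pick it up:

# downloaders/downloader_middlewares/proxy_connect.py  (hypothetical sketch)

class ProxyConnect(object):
    """Assign a proxy to every outgoing request via request.meta['proxy']."""

    def __init__(self, proxy_url):
        self.proxy_url = proxy_url

    @classmethod
    def from_crawler(cls, crawler):
        # PROXY_URL is an assumed custom setting, e.g. "http://user:pass@host:port"
        return cls(crawler.settings.get('PROXY_URL'))

    def process_request(self, request, spider):
        # HttpProxyMiddleware (priority 110 in the settings above) honours this key
        request.meta['proxy'] = self.proxy_url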

Upvotes: 0

Views: 591

Answers (1)

Wilfredo

Reputation: 1548

According to the docs, the path should include the project name (e.g. 'myproject.middlewares.CustomDownloaderMiddleware'); in your case I think it should be:

'chisel.downloaders.downloader_middlewares.proxy_connect.ProxyConnect': 100
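
Applied to the DOWNLOADER_MIDDLEWARES dict from the question, that gives:

DOWNLOADER_MIDDLEWARES = {
    'chisel.downloaders.downloader_middlewares.proxy_connect.ProxyConnect': 100,
    'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
}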

Upvotes: 1
