Som

Reputation: 307

Scrapy (Python) throws ImportError when running with cron

I'm running a scrapy spider with cron, but it throws an ImportError exception:

Traceback (most recent call last):
  File "/Users/som/scrapy_testing/scrapy_testing/spiders/hm_spiders.py", line 2, in <module>
    import scrapy
  File "/Library/Python/2.7/site-packages/scrapy/__init__.py", line 48, in <module>
    from scrapy.spiders import Spider
  File "/Library/Python/2.7/site-packages/scrapy/spiders/__init__.py", line 10, in <module>
    from scrapy.http import Request
  File "/Library/Python/2.7/site-packages/scrapy/http/__init__.py", line 12, in <module>
    from scrapy.http.request.rpc import XmlRpcRequest
  File "/Library/Python/2.7/site-packages/scrapy/http/request/rpc.py", line 7, in <module>
    from six.moves import xmlrpc_client as xmlrpclib
ImportError: cannot import name xmlrpc_client

The strange thing is that when I run the same script manually from the terminal, it works fine.

The cron entry is

*   *   *   *   *   sh /Users/som/sh/hm_scraping.sh

and the script is

#!/bin/bash
python /Users/som/scrapy_testing/scrapy_testing/spiders/hm_spiders.py

I'm using the CrawlerProcess class as described here: http://doc.scrapy.org/en/latest/topics/practices.html

from scrapy.crawler import CrawlerProcess

process = CrawlerProcess({
    'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
})
process.crawl(HmSpider)
process.start()

================================================
EDIT

Based on the comments from MuhammadTahir and lapinkoira, I tested the following directly in the terminal:

/usr/bin/python /Users/som/scrapy_testing/scrapy_testing/spiders/hm_spiders.py

and

sudo -u som /usr/bin/python /Users/som/scrapy_testing/scrapy_testing/spiders/hm_spiders.py

The first one runs fine, but the one with sudo (I've also run it without specifying the user) fails with the same ImportError. Maybe cron uses sudo in the background.

Any ideas?

Thanks!

Upvotes: 1

Views: 312

Answers (1)

gabrielhpugliese

Reputation: 2588

I would try one of these two:

1- Activate the env first:

source /path/of/your/venv/bin/activate && /path/of/your/venv/bin/python /Users/som/scrapy_testing/scrapy_testing/spiders/hm_spiders.py

2- Or call the env's Python directly, without activating it (may not work):

/path/of/your/venv/bin/python /Users/som/scrapy_testing/scrapy_testing/spiders/hm_spiders.py
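
To check whether cron and your terminal really use the same interpreter and the same copy of six, it may help to print both from each environment and compare. Below is a minimal sketch; the function name describe_environment is just an illustrative name, not part of Scrapy or six. Run it once from the terminal and once from cron (redirecting the output to a file). If the six path or version differs between the two runs, that would be consistent with the xmlrpc_client ImportError, although that is an assumption about your setup and not something I can confirm from the traceback alone.

```python
from __future__ import print_function

import os
import sys


def describe_environment():
    """Collect details about the interpreter and the `six` module it would import.

    Hypothetical helper for debugging; not part of Scrapy or six.
    """
    info = {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "PATH": os.environ.get("PATH", ""),
        "PYTHONPATH": os.environ.get("PYTHONPATH", ""),
        "sys_path": list(sys.path),
    }
    try:
        import six
        info["six_file"] = getattr(six, "__file__", None)
        info["six_version"] = getattr(six, "__version__", None)
    except ImportError as exc:
        # six is not importable at all from this interpreter
        info["six_file"] = None
        info["six_version"] = None
        info["six_error"] = str(exc)
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("{0}: {1}".format(key, value))
```

The script uses only the standard library plus an optional import of six, so it should run under both Python 2.7 and Python 3.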

Upvotes: 2
