Reputation: 41
I am trying to deploy a Scrapy spider to ScrapingHub with shub deploy, following their instructions. For some reason, the build looks specifically for a Python 3.6 directory, when it should accept any Python 3.x. My spider is written for Python 3.5, so this is a problem. ScrapingHub's documentation says that specifying "scrapy:1.4-py3" covers the Python 3.x range, but that is clearly not what happens here.
Also, for some reason, it can't seem to find my spider in the project. Is this related to the issue with the 3.6 directory?
Finally, I have installed everything needed from the requirements file.
C:\Users\Desktop\Empery Code\YahooScrape>shub deploy
Packing version 1.0
Deploying to Scrapy Cloud project "205357"
Deploy log last 30 lines:
Deploy log location: C:\Users\AppData\Local\Temp\shub_deploy_of5_m4qg.log
Error: Deploy failed: b'{"status": "error", "message": "Internal build error"}'
    _run(args, settings)
  File "/usr/local/lib/python3.6/site-packages/sh_scrapy/crawl.py", line 103, in _run
    _run_scrapy(args, settings)
  File "/usr/local/lib/python3.6/site-packages/sh_scrapy/crawl.py", line 111, in _run_scrapy
    execute(settings=settings)
  File "/usr/local/lib/python3.6/site-packages/scrapy/cmdline.py", line 148, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "/usr/local/lib/python3.6/site-packages/scrapy/crawler.py", line 243, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "/usr/local/lib/python3.6/site-packages/scrapy/crawler.py", line 134, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "/usr/local/lib/python3.6/site-packages/scrapy/crawler.py", line 330, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "/usr/local/lib/python3.6/site-packages/scrapy/spiderloader.py", line 61, in from_settings
    return cls(settings)
  File "/usr/local/lib/python3.6/site-packages/scrapy/spiderloader.py", line 25, in __init__
    self._load_all_spiders()
  File "/usr/local/lib/python3.6/site-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders
    for module in walk_modules(name):
  File "/usr/local/lib/python3.6/site-packages/scrapy/utils/misc.py", line 63, in walk_modules
    mod = import_module(path)
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 978, in _gcd_import
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 948, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'YahooScrape.spiders'
{"message": "list-spiders exit code: 1", "details": null, "error": "build_error"}
{"status": "error", "message": "Internal build error"}

C:\Users\Desktop\Empery Code\YahooScrape>
scrapy.cfg file:
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.org/en/latest/deploy.html
[settings]
default = YahooScrape.settings
[deploy]
#url = http://localhost:6800/
project = YahooScrape
scrapinghub.yml file:
project: -----
requirements:
  file: requirements.txt
stacks:
  default: scrapy:1.4-py3
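The traceback ends in Scrapy's walk_modules calling import_module, so the same import can be tried locally before deploying. A minimal sketch of that check, run from the project root (the directory containing scrapy.cfg):

from importlib import import_module

# Mirrors the import Scrapy's spider loader performs (walk_modules ->
# import_module in the traceback above). If YahooScrape/spiders/__init__.py
# is missing, this raises the same "ModuleNotFoundError: No module named
# 'YahooScrape.spiders'" as the Scrapy Cloud build.
import_module("YahooScrape.spiders")
print("YahooScrape.spiders imports fine locally")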
Upvotes: 1
Views: 630
Reputation: 20748
Make sure your directory tree looks like this:
$ tree
.
├── YahooScrape
│   ├── __init__.py
│   ├── items.py
│   ├── middlewares.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       ├── yahoo.py
│       └── __init__.py
├── requirements.txt
├── scrapinghub.yml
├── scrapy.cfg
└── setup.py
Pay special attention to YahooScrape/spiders/. It should contain an __init__.py file (an empty one is fine) and your different spiders, usually as separate .py files.
Otherwise YahooScrape.spiders cannot be imported as a Python package, hence the "ModuleNotFoundError: No module named 'YahooScrape.spiders'" message.
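For illustration, a minimal spiders package could look like this. The spider below is a placeholder sketch (the class, name, and URL are not the asker's actual code); only the file layout matters:

# YahooScrape/spiders/__init__.py -- may be completely empty; its presence
# is what makes "spiders" an importable package.

# YahooScrape/spiders/yahoo.py -- hypothetical minimal spider, for layout
# purposes only.
import scrapy

class YahooSpider(scrapy.Spider):
    name = "yahoo"
    start_urls = ["https://finance.yahoo.com/"]

    def parse(self, response):
        # Yield the page title as a bare-bones example item.
        yield {"title": response.css("title::text").extract_first()}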
Upvotes: 2