Reputation: 309
How can I run 2 spiders in series? Running this runs the first spider but not the second. Is there a way to wait for one to finish?:
from scrapy import cmdline
cmdline.execute("scrapy crawl spider1".split())
cmdline.execute("scrapy crawl spider2".split())
Edit 1: I changed it to use .wait():
spider1 = subprocess.Popen(cmdline.execute("scrapy crawl spider1".split()))
spider1.wait()
spider2 = subprocess.Popen(cmdline.execute("scrapy crawl spider2".split()))
spider2.wait()
Did I do it wrong? It still just runs the first one.
Edit 2:
Traceback (most recent call last):
File "/usr/bin/scrapy", line 9, in <module>
load_entry_point('Scrapy==0.24.6', 'console_scripts', 'scrapy')()
File "/usr/lib/pymodules/python2.7/scrapy/cmdline.py", line 109, in execute
settings = get_project_settings()
File "/usr/lib/pymodules/python2.7/scrapy/utils/project.py", line 60, in get_project_settings
settings.setmodule(settings_module_path, priority='project')
File "/usr/lib/pymodules/python2.7/scrapy/settings/__init__.py", line 109, in setmodule
module = import_module(module)
File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
ImportError: No module named settings
Upvotes: 0
Views: 406
Reputation: 8786
I would use subprocess, whose Popen objects have a .wait() method. Or you could use subprocess.call(), which waits automatically and returns the command's exit code; the crawl output itself goes straight to the terminal. (Note that wrapping scrapy.cmdline.execute() in Popen doesn't work: execute() runs the crawl in your current process and calls sys.exit() when it finishes, which is why your second spider never starts.)
import subprocess

# call() blocks until the command exits and returns its exit code (0 on success)
spider1 = subprocess.call(["scrapy", "crawl", "spider1"])
print spider1

spider2 = subprocess.call(["scrapy", "crawl", "spider2"])
print spider2
This waits until the first spider is done and then runs the second.
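For completeness, here is a minimal sketch of the Popen/.wait() variant (same spider names as in the question). The key difference from your Edit 1 is that Popen takes the command list directly, instead of the result of cmdline.execute():

import subprocess

# Popen starts the crawl as a separate process without blocking
spider1 = subprocess.Popen(["scrapy", "crawl", "spider1"])
spider1.wait()  # block until spider1 exits

spider2 = subprocess.Popen(["scrapy", "crawl", "spider2"])
spider2.wait()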
Upvotes: 2