Reputation: 121578
Say I have this spider:
class SomeSpider(Spider):
    name = 'spname'
Then I can run the spider by creating a new instance of SomeSpider and passing it to the crawler, for example:
spider = SomeSpider()
crawler = Crawler(settings)
crawler.configure()
crawler.crawl(spider)
....
Can I do the same thing using just the spider name, 'spname'?
crawler.crawl('spname') ## I give just the spider name here
How can I create the spider dynamically? I guess Scrapy's spider manager does this internally, since this works fine:
scrapy crawl spname
One solution would be to walk my spiders folder, collect all the Spider classes, and filter them by their name attribute, but that looks far-fetched!
Thank you in advance for your help.
Upvotes: 2
Views: 1672
Reputation: 121578
Inspired by @kev's answer, here is a function that inspects the spider classes:
from scrapy.utils.misc import walk_modules
from scrapy.utils.spider import iter_spider_classes

# Method of the class that owns the self._spiders dict (spider name -> spider class)
def _load_spiders(self, module='spiders.SomeSpider'):
    for mod in walk_modules(module):
        for spcls in iter_spider_classes(mod):
            self._spiders[spcls.name] = spcls
Then you can instantiate a spider by name:
somespider = self._spiders['spname']()
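To make the lookup idea concrete without depending on a Scrapy install, here is a minimal standard-library sketch of the same pattern: scan a module for classes that carry a name and index them by that name. The `Spider` base class, `iter_named_spider_classes`, `build_registry`, and the in-memory `demo_spiders` module are all hypothetical stand-ins for illustration; they are not Scrapy's own code.

```python
import inspect
import types


class Spider:
    """Hypothetical stand-in for a spider base class (not scrapy.Spider)."""
    name = None


def iter_named_spider_classes(module, base=Spider):
    """Yield subclasses of `base` defined in `module` that have a non-empty name."""
    for obj in list(vars(module).values()):
        if (
            inspect.isclass(obj)
            and issubclass(obj, base)
            and obj.__module__ == module.__name__  # skip classes merely imported here
            and getattr(obj, "name", None)         # skip unnamed (abstract) spiders
        ):
            yield obj


def build_registry(modules):
    """Map each spider name to its class across the given modules."""
    return {
        cls.name: cls
        for mod in modules
        for cls in iter_named_spider_classes(mod)
    }


# Build a throwaway module in memory so the sketch runs without a project layout.
demo = types.ModuleType("demo_spiders")
demo.Spider = Spider
exec(
    "class SomeSpider(Spider):\n"
    "    name = 'spname'\n"
    "\n"
    "class UnnamedHelper(Spider):\n"
    "    pass\n",
    demo.__dict__,
)

registry = build_registry([demo])
somespider = registry["spname"]()  # instantiate by name only
```

Only `SomeSpider` ends up in the registry: `UnnamedHelper` has no name, and the base class itself is defined outside the scanned module.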
Upvotes: 1
Reputation: 161754
Please take a look at the source code:
# scrapy/commands/crawl.py
class Command(ScrapyCommand):
    def run(self, args, opts):
        ...

# scrapy/spidermanager.py
class SpiderManager(object):
    def _load_spiders(self, module):
        ...

    def create(self, spider_name, **spider_kwargs):
        ...

# scrapy/utils/spider.py
def iter_spider_classes(module):
    """Return an iterator over all spider classes defined in the given module
    that can be instantiated (ie. which have name)
    """
    ...
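The bodies above are elided, so the following is only a guess at the general shape of a `create(spider_name, **spider_kwargs)` method, not the actual Scrapy source. `SpiderRegistry`, `DemoSpider`, and the error message are hypothetical names chosen for this sketch; it assumes the manager already holds a name-to-class mapping.

```python
class DemoSpider:
    """Hypothetical spider class that accepts arbitrary keyword arguments."""
    name = "spname"

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class SpiderRegistry:
    """Illustrative manager: looks spiders up by name and instantiates them."""

    def __init__(self, spider_classes):
        # Index the classes by their name attribute.
        self._spiders = {cls.name: cls for cls in spider_classes}

    def list(self):
        """Return the known spider names in sorted order."""
        return sorted(self._spiders)

    def create(self, spider_name, **spider_kwargs):
        """Instantiate the spider registered under spider_name."""
        try:
            spcls = self._spiders[spider_name]
        except KeyError:
            raise KeyError("Spider not found: %s" % spider_name) from None
        return spcls(**spider_kwargs)


manager = SpiderRegistry([DemoSpider])
spider = manager.create("spname", domain="example.com")
```

Keeping the lookup behind one method means the caller only ever needs the string name, which is the behavior the question asks for.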
Upvotes: 3