agstudy

Reputation: 121578

How does Scrapy find Spider class by its name?

Say I have this spider:

class SomeSpider(Spider):
    name = 'spname'

Then I can run my spider by creating an instance of SomeSpider and passing it to the crawler, for example:

spider = SomeSpider()
crawler = Crawler(settings)
crawler.configure()
crawler.crawl(spider)
....

Can I do the same thing using just the spider name, i.e. 'spname'?

crawler.crawl('spname')  # pass just the spider name here

How can I create the spider dynamically from its name? I guess Scrapy's spider manager does this internally, since this works fine:

scrapy crawl spname

One solution would be to walk my spiders folder, collect all the Spider classes, and filter them by their name attribute, but this looks far-fetched!

Thank you in advance for your help.

Upvotes: 2

Views: 1672

Answers (2)

agstudy

Reputation: 121578

Inspired by @kev's answer, here is a method that inspects the spider modules and indexes the spider classes by name:

from scrapy.utils.misc import walk_modules
from scrapy.utils.spider import iter_spider_classes

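def _load_spiders(self, module='spiders.SomeSpider'):
    # Method of a class that owns a self._spiders dict (spider name -> spider class).
    # walk_modules imports the given module path and all of its submodules.
    for mod in walk_modules(module):
        # iter_spider_classes yields only Spider subclasses that define a name.
        for spcls in iter_spider_classes(mod):
            self._spiders[spcls.name] = spcls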

Then you can instantiate the spider from its name:

somespider = self._spiders['spname']()
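
To make the filtering step concrete, here is a rough, stdlib-only sketch of the kind of check a helper like iter_spider_classes performs: keep only classes that subclass the spider base, are defined in the module being scanned, and have a non-empty name. The Spider base class, iter_named_classes, build_registry and the demo_spiders module below are hypothetical stand-ins so the example runs without Scrapy; this is not Scrapy's actual code.

```python
import inspect
import types


class Spider:
    """Stand-in for Scrapy's Spider base class (an assumption for this sketch)."""
    name = None


def iter_named_classes(module, base=Spider):
    """Yield classes defined in `module` that subclass `base` and set a name."""
    for obj in vars(module).values():
        if (
            inspect.isclass(obj)
            and issubclass(obj, base)
            and obj.__module__ == module.__name__  # skip classes merely imported here
            and getattr(obj, "name", None)         # skip classes without a name
        ):
            yield obj


def build_registry(modules, base=Spider):
    """Map each spider name to its class across the given modules."""
    return {cls.name: cls for mod in modules for cls in iter_named_classes(mod, base)}


# Build a throwaway module in memory so the example needs no project on disk.
demo = types.ModuleType("demo_spiders")
demo.Spider = Spider  # simulates "from somewhere import Spider" inside the module
demo.SomeSpider = type(
    "SomeSpider", (Spider,), {"name": "spname", "__module__": "demo_spiders"}
)
demo.Unnamed = type("Unnamed", (Spider,), {"__module__": "demo_spiders"})

registry = build_registry([demo])
spider = registry["spname"]()  # instantiate from the name alone
```

In a real project the list of modules would come from walking the spiders package (walk_modules in the answer above) rather than from an in-memory module.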

Upvotes: 1

kev

Reputation: 161754

Please take a look at the source code:

# scrapy/commands/crawl.py

class Command(ScrapyCommand):

    def run(self, args, opts):
        ...

# scrapy/spidermanager.py

class SpiderManager(object):

    def _load_spiders(self, module):
        ...

    def create(self, spider_name, **spider_kwargs):
        ...

# scrapy/utils/spider.py

def iter_spider_classes(module):
    """Return an iterator over all spider classes defined in the given module
    that can be instantiated (ie. which have name)
    """
    ...
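
The bodies above are elided, so as a rough illustration of the general idea only: a manager of this kind can keep a dict from spider name to spider class and instantiate on request. The sketch below is an assumption about that pattern, not the contents of scrapy/spidermanager.py; SpiderRegistry, register and the category argument are hypothetical names.

```python
class SpiderRegistry:
    """Hypothetical name -> class registry, loosely modelled on the outline above."""

    def __init__(self):
        self._spiders = {}

    def register(self, spcls):
        """Index a spider class by its name; usable as a class decorator."""
        self._spiders[spcls.name] = spcls
        return spcls

    def create(self, spider_name, **spider_kwargs):
        """Instantiate the spider registered under `spider_name`."""
        try:
            spcls = self._spiders[spider_name]
        except KeyError:
            raise KeyError("Spider not found: %s" % spider_name) from None
        return spcls(**spider_kwargs)


registry = SpiderRegistry()


@registry.register
class SomeSpider:
    name = "spname"

    def __init__(self, category=None):
        # Keyword arguments passed to create() end up here.
        self.category = category


spider = registry.create("spname", category="books")
```

Asking for an unregistered name raises KeyError, which makes a typo in the spider name fail loudly instead of returning None.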

Upvotes: 3
