Kurt Peek

Reputation: 57461

How to pass two user-defined arguments to a scrapy spider

Following "How to pass a user defined argument in scrapy spider", I wrote the following simple spider:

import scrapy

class Funda1Spider(scrapy.Spider):
    name = "funda1"
    allowed_domains = ["funda.nl"]

    def __init__(self, place='amsterdam', **kwargs):
        super(Funda1Spider, self).__init__(**kwargs)  # pass any remaining arguments on to scrapy.Spider
        self.start_urls = ["http://www.funda.nl/koop/%s/" % place]

    def parse(self, response):
        # Save the response body under a filename derived from the URL,
        # e.g. amsterdam.html
        filename = response.url.split("/")[-2] + '.html'
        with open(filename, 'wb') as f:
            f.write(response.body)

This seems to work; for example, if I run it from the command line with

scrapy crawl funda1 -a place=rotterdam

it generates a file rotterdam.html which looks similar to http://www.funda.nl/koop/rotterdam/. I would next like to extend this so that one can specify a subpage, for instance http://www.funda.nl/koop/rotterdam/p2/. I've tried the following:

import scrapy

class Funda1Spider(scrapy.Spider):
    name = "funda1"
    allowed_domains = ["funda.nl"]

    def __init__(self, place='amsterdam', page='', **kwargs):
        super(Funda1Spider, self).__init__(**kwargs)  # pass any remaining arguments on to scrapy.Spider
        self.start_urls = ["http://www.funda.nl/koop/%s/p%s/" % (place, page)]

    def parse(self, response):
        filename = response.url.split("/")[-2] + '.html'
        with open(filename, 'wb') as f:
            f.write(response.body)

However, if I try to run this with

scrapy crawl funda1 -a place=rotterdam page=2

I get the following error:

crawl: error: running 'scrapy crawl' with more than one spider is no longer supported

I don't really understand this error message, as I'm not trying to crawl two spiders, but simply trying to pass two keyword arguments to modify the start_urls. How could I make this work?

Upvotes: 3

Views: 857

Answers (1)

Granitosaurus

Reputation: 21436

When providing multiple arguments, you need to prefix each one with -a. Without the prefix, scrapy crawl treats the bare page=2 token as the name of a second spider, which is why you get the "more than one spider" error.

The correct line for your case would be:

scrapy crawl funda1 -a place=rotterdam -a page=2
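
For reference, every -a key=value option is passed to the spider's __init__ as a keyword argument, so the command above is equivalent to instantiating the spider with place='rotterdam' and page='2'. A minimal sketch of the same crawl run programmatically with scrapy's CrawlerProcess (the import path for Funda1Spider is an assumption; adjust it to your project layout):

from scrapy.crawler import CrawlerProcess

from funda.spiders.funda1 import Funda1Spider  # hypothetical import path

process = CrawlerProcess()
# Each -a option from the command line becomes a keyword argument here
process.crawl(Funda1Spider, place='rotterdam', page='2')
process.start()  # blocks until the crawl finishes

Note that page arrives as the string '2': spider arguments are always strings, so convert them in __init__ if you need integers.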

Upvotes: 6
