Mirage

Reputation: 31548

How can I make two requests simultaneously with Scrapy?

I am scraping a job site where the first page has links to all the jobs. Right now I am storing the title, job, and company from the first page.

But I also want to store the description, which is available by clicking on the job title. I want to store it together with the current item's fields.

This is my current code:

def parse(self, response):
    hxs = HtmlXPathSelector(response)
    sites = hxs.select("//div[@class='jobenteries']")
    items = []
    for site in sites[:3]:
        print "Hello"
        item = DmozItem()
        item['title'] = site.select('a/text()').extract()
        item['desc'] = ''
        items.append(item)
    return items

But the description is on the page that each job link points to. How can I do that?

Upvotes: 0

Views: 293

Answers (1)

Shane Evans

Reputation: 2254

From the first page, return a Request for each second-page (job detail) URL and pass the partially filled item along in the request's meta dict. In the callback method for the second page, read the data you passed from response.meta, add the description, and return the fully populated item.

See "Passing additional data to callback functions" in the Scrapy docs for more details and an example.

Upvotes: 3
