How can I make two requests simultaneously with Scrapy?

Question

I am scraping job sites where the first page has links to all the jobs. Right now I am storing the title, job, and company from the first page.

But I also want to store the description, which is only available after clicking on the job title. I want to store it together with the current item.

This is my current code:

def parse(self, response):
    hxs = HtmlXPathSelector(response)
    sites = hxs.select("//div[@class='jobenteries']")
    items = []
    for site in sites[:3]:
        print "Hello"
        item = DmozItem()
        item['title'] = site.select('a/text()').extract()
        item['desc'] = ''
        items.append(item)
    return items

But the description is on the page that the job link points to. How can I do that?

Shane Evans · Accepted Answer

From the first page, return a Request for each job's second page and pass the partially filled item along in the request.meta dict. In the callback method for the second page, you can read the data you passed from response.meta, add the description, and return the fully populated item.

See "Passing additional data to callback functions" in the Scrapy docs for more details and an example.
