virvaldium

Reputation: 226

How to make an additional request and get data from it

I need to scrape data from a site and save it to disk; I am using Scrapy. While parsing a page, I also need to get data from another page. How can I do that?

import os

import scrapy
from scrapy import Request, Selector


class MySpider(scrapy.Spider):

    name = "my_spyder"

    start_urls = [
        'https://www.example.com/title/1',
        'https://www.example.com/title/2',
        'https://www.example.com/title/3',
    ]

    def parse(self, response):
        item = MyItem()
        main_page_selector = Selector(response)
        ...
        tagline_url = os.path.join(response.url, 'taglines')
        request = Request(url=tagline_url, callback=self.get_tags)
        item['tags'] = yield request
        ...
        yield item

    def get_tags(self, response):
        tagline_selector = Selector(response)
        taglines = []
        for tag in tagline_selector.xpath('//div[@class="soda even"]/text()').getall():
            taglines.append(tag.strip())

        return taglines

How do I write the taglines collected in get_tags into the item's 'tags' field, given that these requests are executed asynchronously?

Upvotes: 1

Views: 101

Answers (1)

Zhd Zilin

Reputation: 143

request = Request(url=tagline_url, callback=self.get_tags)
request.meta["item"] = item
yield request

The code above goes in the parse method (replacing the item['tags'] = yield request line).

item = response.meta["item"]
#...
item["tags"] = taglines
yield item

The second snippet goes at the end of the get_tags method: instead of returning the taglines, fill in the item and yield it there.
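Putting the two snippets together, here is a minimal sketch of the complete spider that carries the item from parse to get_tags through request.meta. The MyItem definition below is assumed only to make the example self-contained (just the tags field is shown); the URLs and the "soda even" class come from the question.

import os

import scrapy
from scrapy import Request


class MyItem(scrapy.Item):
    # Assumed minimal item definition; the real MyItem in the question
    # presumably declares more fields than just 'tags'.
    tags = scrapy.Field()


class MySpider(scrapy.Spider):
    name = "my_spyder"

    start_urls = [
        'https://www.example.com/title/1',
        'https://www.example.com/title/2',
        'https://www.example.com/title/3',
    ]

    def parse(self, response):
        item = MyItem()
        # ... fill the other fields of the item from the main page ...

        tagline_url = os.path.join(response.url, 'taglines')
        request = Request(url=tagline_url, callback=self.get_tags)
        # Attach the partially filled item so get_tags can finish it.
        request.meta["item"] = item
        yield request

    def get_tags(self, response):
        item = response.meta["item"]

        taglines = []
        for tag in response.xpath('//div[@class="soda even"]/text()').getall():
            taglines.append(tag.strip())

        # Complete the item and yield it; Scrapy then passes it on to the
        # configured item pipelines / feed export.
        item["tags"] = taglines
        yield item

In newer Scrapy versions the same hand-off can also be done with the cb_kwargs argument of Request, but passing the item through meta as shown in the answer works just as well.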

Upvotes: 1
