Tim

Reputation: 201

Clicking next button in Scrapy

I am scraping https://www.trollandtoad.com/magic-the-gathering/aether-revolt/10066 and trying to click the next button so that I can go to the next page and scrape it. I have done this in other spiders, so I reused that code and adapted it to this website, but it is not working: only the first page gets scraped.


    def parse(self, response):
        for game in response.css('div.card > div.row'):
            item = GameItem()
            item["Category"] = game.css("div.col-12.prod-cat a::text").get()
            item["Card_Name"]  = game.css("a.card-text::text").get()
            for buying_option in game.css('div.buying-options-table div.row:not(:first-child)'):
                item["Seller"] = buying_option.css('div.row.align-center.py-2.m-auto > div.col-3.text-center.p-1 > img::attr(title)').get()
                item["Condition"] = buying_option.css("div.col-3.text-center.p-1::text").get()
                item["Price"] = buying_option.css("div.col-2.text-center.p-1::text").get()
                yield item
            next_page = response.xpath('//a[contains(., "Next Page")]/@href').get()
            # If it exists and there is a next page enter if statement
            if next_page is not None:
                # Go to next page
                yield response.follow(next_page, self.parse)

UPDATE #1

Here is a snapshot of the HTML code for the next button

[Image: HTML for next page]

UPDATE #2

Here is my updated code for going to the next page. It still does not work, but I think I am closer to the right code.

    next_page = response.xpath('//div[contains(., "Next Page")]/@class').get()
    # If it exists and there is a next page enter if statement
    if next_page is not None:
        # Go to next page
        yield response.follow(next_page, self.parse)

Upvotes: 0

Views: 1596

Answers (1)

gangabass

Reputation: 10666

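The "Next Page" control is a div with a data-page attribute, not a link with an href, so there is no URL for response.follow to use. You need to find the next page number and then submit the form with it: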

import scrapy  # needed for scrapy.FormRequest

# parse() belongs inside the spider class; GameItem is the item class from the question.
def parse(self, response):
    for game in response.css('div.card > div.row'):
        item = GameItem()
        item["Category"] = game.css("div.col-12.prod-cat a::text").get()
        item["Card_Name"] = game.css("a.card-text::text").get()
        # One item is yielded per buying option (the first row is skipped).
        for buying_option in game.css('div.buying-options-table div.row:not(:first-child)'):
            item["Seller"] = buying_option.css('div.row.align-center.py-2.m-auto > div.col-3.text-center.p-1 > img::attr(title)').get()
            item["Condition"] = buying_option.css("div.col-3.text-center.p-1::text").get()
            item["Price"] = buying_option.css("div.col-2.text-center.p-1::text").get()
            yield item
    # Read the page number from the "Next Page" div, ignoring it when its
    # class contains "hide" (presumably the case on the last page).
    next_page_number = response.xpath('//div[div[.="Next Page"]][not(contains(@class, "hide"))]/@data-page').get()
    if next_page_number:
        # Resubmit the category form with the next page number and parse the result.
        yield scrapy.FormRequest.from_response(
            response=response,
            formid="category_form",
            formdata={
                'page-no': next_page_number,
            },
            callback=self.parse
        )
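
To illustrate what the form submission does, here is a rough standard-library sketch of the idea behind FormRequest.from_response: take the fields already present in the form, override the ones passed in formdata, and URL-encode the result as the request body. It is not Scrapy's actual implementation, which also handles selects, textareas and submit buttons. The sample markup, the sort-order field and the names FormFieldCollector and build_form_body are assumptions made for illustration; only the category_form id and the page-no field come from the code above.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical markup: the real form on the page may contain other fields.
SAMPLE_HTML = """
<form id="category_form" method="post">
  <input type="hidden" name="page-no" value="1">
  <input type="hidden" name="sort-order" value="A-Z">
</form>
"""


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of <input> tags inside the form with a given id."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.inside = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.inside = True
        elif tag == "input" and self.inside and "name" in attrs:
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.inside = False


def build_form_body(html, form_id, formdata):
    """Merge the form's existing fields with formdata and URL-encode them."""
    collector = FormFieldCollector(form_id)
    collector.feed(html)
    merged = {**collector.fields, **formdata}  # formdata overrides existing values
    return urlencode(merged)


body = build_form_body(SAMPLE_HTML, "category_form", {"page-no": "2"})
print(body)  # page-no=2&sort-order=A-Z
```

Because the other fields are carried over unchanged, only the page number has to be supplied in formdata.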

Upvotes: 2
