Rahul soni

Reputation: 23

Not able to iterate through multiple pages while scraping data

I have to scrape the reviews and ratings for a product on Flipkart, and I need at least 30-40 of them. Only 10 reviews are shown on the first page, so I have to click through to the next pages. Below is the code I am using to check whether I can click the Next button.

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
import time

driver = webdriver.Chrome(r"chromedriver.exe")
driver.get('https://www.flipkart.com/hp-15s-ryzen-3-dual-core-3250u-8-gb-1-tb-hdd-256-gb-ssd-windows-10-home-15s-gr0012au-laptop/product-reviews/itm9e1f8deeed35f?pid=COMFZHFWBE7APPH2&lid=LSTCOMFZHFWBE7APPH2AR705G&marketplace=FLIPKART&page=2')

for page in range(4):
    try:
        # Matches the first <span> inside the pagination nav
        next_butt = driver.find_element_by_xpath("//nav[@class='yFHi8N']/a/span")
        if next_butt.text == 'NEXT':
            next_butt.click()
    except NoSuchElementException:
        continue
    time.sleep(1)  # wait for the next page to load

Whenever I run this code, I observe that it clicks the Next button, but after the first iteration it clicks the Previous button instead, so I never get past that page.

Please help.

Upvotes: 1

Views: 59

Answers (1)

cruisepandey

Reputation: 29382

Look at the URL you shared:

https://www.flipkart.com/hp-15s-ryzen-3-dual-core-3250u-8-gb-1-tb-hdd-256-gb-ssd-windows-10-home-15s-gr0012au-laptop/product-reviews/itm9e1f8deeed35f?pid=COMFZHFWBE7APPH2&lid=LSTCOMFZHFWBE7APPH2AR705G&marketplace=FLIPKART&page=2

At the end you can see page=2. If I change that to page=3, I get the third page of reviews without having the Selenium bot click the Next button at all.

So what I would do here is substitute an integer page_number variable into the URL, like below:

Sample code:

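from time import sleep

# Assumes `driver` is the webdriver.Chrome instance created in the question.
driver.maximize_window()
page_number = 1
for page in range(4):
    # Load each review page directly by substituting the page number into the URL
    driver.get("https://www.flipkart.com/hp-15s-ryzen-3-dual-core-3250u-8-gb-1-tb-hdd-256-gb-ssd-windows-10-home-15s-gr0012au-laptop/product-reviews/itm9e1f8deeed35f?pid=COMFZHFWBE7APPH2&lid=LSTCOMFZHFWBE7APPH2AR705G&marketplace=FLIPKART&page=%s" % page_number)
    # scrape anything you want here
    page_number = page_number + 1
    sleep(5)  # pause between page loads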
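
If you would rather not depend on page being the last parameter in the URL, here is a rough sketch that rewrites the page query parameter with the standard library's urllib.parse. The helper name with_page is my own invention for illustration, not something from Selenium or Flipkart, and it only builds the URLs; you would still pass each one to driver.get() yourself.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_page(url, page_number):
    """Return `url` with its `page` query parameter set to `page_number`.

    Hypothetical helper: any existing `page` parameter is dropped and a new
    one is appended at the end of the query string.
    """
    parts = urlsplit(url)
    # Keep every query parameter except an existing `page`
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "page"]
    query.append(("page", str(page_number)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# URL taken from the question
base_url = (
    "https://www.flipkart.com/hp-15s-ryzen-3-dual-core-3250u-8-gb-1-tb-hdd-"
    "256-gb-ssd-windows-10-home-15s-gr0012au-laptop/product-reviews/"
    "itm9e1f8deeed35f?pid=COMFZHFWBE7APPH2&lid=LSTCOMFZHFWBE7APPH2AR705G"
    "&marketplace=FLIPKART&page=2"
)

# Pages 1 to 4, i.e. roughly 40 reviews at 10 per page
page_urls = [with_page(base_url, number) for number in range(1, 5)]
```

Each entry in page_urls can then be loaded in the loop in place of the "%s" formatting above.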

Upvotes: 1
