Recursively iterate over multiple web pages and scrape using selenium

Question

This is a follow up question to the query which I had about scraping web pages.

My earlier question: Pin down exact content location in html for web scraping urllib2 Beautiful Soup

This question is regarding doing the same, but the issue is to do the same recursively over multiple page s/views.

Here is my code

from selenium.webdriver.firefox import web driver

driver = webdriver.WebDriver()
driver.get('http://www.walmart.com/ip/29701960?page=seeAllReviews')

for review in driver.find_elements_by_class_name('BVRRReviewDisplayStyle3Main'):

    title = review.find_element_by_class_name('BVRRReviewTitle').text
    rating =review.find_element_by_xpath('.//div[@class="BVRRRatingNormalImage"]//img').get_attribute('title')
print title, rating

From the url, you'll see that no change is seen if we navigate to the second page, otherwise it wouldn't have been an issue. In this case, the next page clicker calls in a javascript from the server. Is there a way we can still scrape this using selenium in python just by some slight modification of my presented code ? Please let me know if there is.

Thanks.

Richard · Accepted Answer

I think this would work. Although the python might be a little off, this should give you a starting point:

continue = True
while continue:
    try:
        for review in driver.find_elements_by_class_name('BVRRReviewDisplayStyle3Main'):
            title = review.find_element_by_class_name('BVRRReviewTitle').text
            rating =review.find_element_by_xpath('.//div[@class="BVRRRatingNormalImage"]//img').get_attribute('title')
        print title, rating
        driver.find_element_by_name('BV_TrackingTag_Review_Display_NextPage').click()
    except:
        print "Done!"
        continue = False

Recursively iterate over multiple web pages and scrape using selenium

Answers (2)

Related Questions