HJKWON

Reputation: 1

Page navigation using Selenium

I would like to scrape car reviews from the webpage below for personal interest:

www.cardekho.com/user-reviews/maruti-alto-800

I succeeded in scraping the reviews on the first page with the code below:

# Run these two commands in a terminal, not inside the Python script:
#   pip install selenium
#   pip install webdriver-manager
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By  # needed for By.CSS_SELECTOR below
from webdriver_manager.chrome import ChromeDriverManager

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

url = 'https://www.cardekho.com/user-reviews/maruti-alto-800'
driver.get(url)

reviews = driver.find_elements(By.CSS_SELECTOR, ".contentspace")


for i in reviews:
    i_title = i.find_element(By.CSS_SELECTOR, "h3 > a")
    i_desc = i.find_element(By.CSS_SELECTOR, "p")
    print(i_title.text, i_desc.text)

However, I cannot scrape the remaining reviews on the following pages. The page bar lists pages 1 to 16 plus a "next" link.

  1. Could you please help me scrape all the other pages?
  2. I would also like to include the star rating with each review in my dataframe. Is there a way to scrape it as well?

I tried the code below, which selects the main part of the page bar. However, page_bar[0] took me to page 6, and any index above 0 raised "list index out of range".

page_bar = driver.find_elements(By.CSS_SELECTOR, '#rf01 > div.app-content > div > div:nth-child(1) > main > div > div.gsc_col-xs-12.gsc_col-sm-12.gsc_col-md-8.gsc_col-lg-9 > div:nth-child(3) > section > div > div.marginTop20 > div > div > div > ul')
for i in page_bar:
    print(i.text)

page_bar[0].click()

Upvotes: 0

Views: 228

Answers (1)

Sin Han Jinn

Reputation: 684

If you click through to the next pages, you will notice that the URL contains the page number.

E.g. page 2: https://www.cardekho.com/user-reviews/maruti-alto-800/2?subtab=latest

E.g. page 3: https://www.cardekho.com/user-reviews/maruti-alto-800/3?subtab=latest

Therefore, you just need a for loop over pages 1-16 that changes the number in the URL and loads each page in turn; that way you scrape every page you need.

For example,

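for i in range(1, 17):  # range() excludes the stop value, so 17 covers pages 1-16
    current_link = "https://www.cardekho.com/user-reviews/maruti-alto-800/" + str(i) + "?subtab=latest"
    driver.get(current_link)  # load the page before scraping it
    # perform your scraping here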
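To make the loop concrete, here is a minimal sketch that combines the URL pattern above with the selectors from the question. The names page_urls and scrape_reviews are hypothetical helpers, the page count of 16 is taken from the question, and the sketch assumes page 1 also responds at /1?subtab=latest and that the .contentspace selector still matches the site's markup.

```python
from typing import Dict, Iterable, Iterator, List

BASE_URL = "https://www.cardekho.com/user-reviews/maruti-alto-800"
LAST_PAGE = 16  # assumption: page count reported in the question


def page_urls(base_url: str = BASE_URL, last_page: int = LAST_PAGE) -> Iterator[str]:
    """Yield the URL of every review page, 1 through last_page inclusive."""
    for page in range(1, last_page + 1):
        yield f"{base_url}/{page}?subtab=latest"


def scrape_reviews(driver, urls: Iterable[str]) -> List[Dict[str, str]]:
    """Load each URL and collect the title and text of every review on it."""
    # Imported here so page_urls() can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    rows = []
    for url in urls:
        driver.get(url)
        # Selectors copied from the question; they may need updating.
        for review in driver.find_elements(By.CSS_SELECTOR, ".contentspace"):
            title = review.find_element(By.CSS_SELECTOR, "h3 > a").text
            desc = review.find_element(By.CSS_SELECTOR, "p").text
            rows.append({"title": title, "description": desc})
    return rows
```

With the driver from the question, rows = scrape_reviews(driver, page_urls()) would gather every page into one list of dicts, which pandas.DataFrame(rows) can turn into a dataframe. If the pages render slowly, an explicit wait before find_elements may be needed.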

Upvotes: 1
