Reputation: 15
I'm scraping this website (http://rera.rajasthan.gov.in/ProjectSearch) using Python and Selenium. My code works, but it currently only scrapes the first page. I would like to iterate through all the pages and scrape every VIEW link on them, but the site handles pagination in an unusual way. How can I go through the pages and scrape them one by one?
My source code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException, WebDriverException
import time
opt = webdriver.ChromeOptions()
opt.add_argument("--ignore-certificate-errors")
opt.add_argument("--start-maximized")
driver = webdriver.Chrome(executable_path=r"C:\Users\fit foodie\PycharmProjects\Selenium\Browser\chromedriver.exe", options=opt)
driver.get(url="http://rera.rajasthan.gov.in/")
search= driver.find_element_by_xpath("//*[@id='liSearch']/a").click()
proj_src=driver.find_element_by_xpath("//*[@id='liSearch']/ul/li[1]/a").click()
search_btn = driver.find_element_by_xpath('//*[@id="btn_SearchProjectSubmit"]').click()
def page():
    while True:
        try:
            driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH, "//*[@id='OuterProjectGrid']/div[4]/div[4]/a"))))
            driver.find_element_by_xpath("//*[@id='OuterProjectGrid']/div[4]/div[4]/a").click()
            print("Navigating to Next Page")
        except (TimeoutException, WebDriverException) as e:
            print("Last page reached")
            break
With this code I am unable to paginate through the results.
Upvotes: 0
Views: 270
Reputation: 33384
For pagination, use the following CSS selector and add a delay after each click.
def page():
    i = 2
    while True:
        try:
            driver.execute_script("arguments[0].scrollIntoView();", WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.CSS_SELECTOR, "a[data-p='{}']".format(i)))))
            driver.find_element_by_css_selector("a[data-p='{}']".format(i)).click()
            print("Navigating to Next Page " + str(i))
            i = i + 1
            time.sleep(1)
        except (TimeoutException, WebDriverException) as e:
            print("Last page reached")
            break

page()
Output: console snapshot
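The loop above only clicks through the pager; it does not collect anything between clicks. Here is a minimal sketch of one way to separate the page turning from the scraping. The names `paginate`, `selenium_go_to_page` and the `scrape_page` callback are hypothetical helpers, not part of Selenium, and the `a[data-p='N']` selector is simply the one used above. The sketch clicks the element returned by the wait, so it does not rely on the `find_element_by_*` helpers, which were removed in Selenium 4.3.

```python
import time


def paginate(go_to_page, scrape_page, first_page=2, delay=1.0):
    """Scrape the page on screen, then keep moving to the next one.

    go_to_page(n) must return True if page n was opened and False if it
    could not be opened; scrape_page() returns the rows currently shown.
    Both are hypothetical callbacks supplied by the caller.
    """
    rows = list(scrape_page())        # page 1 is already displayed
    page = first_page
    while go_to_page(page):
        time.sleep(delay)             # crude pause so the grid can refresh
        rows.extend(scrape_page())
        page += 1
    return rows


def selenium_go_to_page(driver, timeout=20):
    """Build a go_to_page callback around the a[data-p='N'] pager links."""
    # Imported lazily so paginate() can be tried without Selenium installed.
    from selenium.common.exceptions import TimeoutException, WebDriverException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def go_to_page(n):
        locator = (By.CSS_SELECTOR, "a[data-p='{}']".format(n))
        try:
            link = WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable(locator))
            driver.execute_script("arguments[0].scrollIntoView();", link)
            link.click()
            return True
        except (TimeoutException, WebDriverException):
            # No clickable link for page n: treat it as the last page.
            return False

    return go_to_page
```

With a live driver this would be called roughly as `rows = paginate(selenium_go_to_page(driver), lambda: scrape_rows(driver))`, where `scrape_rows` stands for whatever function you write to read the VIEW links from the current grid.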
If your objective is to get all the table data from all pages, you can do that without Selenium as well. You can use the Python requests module and send a POST request.
import requests
data={
"PageSize" :1250,
"page": 1
}
res=requests.post("http://rera.rajasthan.gov.in/Home/GetProjectsList",data=data).json()
for item in res['Data']['Items']:
    print(item['DistrictName'],item['ProjectName'],item['ProjectTypeName'],item['PromoterName'],item['ApplicationNo'],item['CertificateNo'])
The output for all pages looks like this:
Jaipur ATHARV APPARTMENT Group Housing SHP HOME LLP Revoked Project Revoked Project
Jaipur JVJ DREAM RESIDENCY Group Housing JVJ DREAM DEVELOPERS LLP RAJ-RERA-APP-P-2020-2214 (19/03/2020) RAJ/P/2020/1262 (29/05/2020)
Chittorgarh SHARDA ROYAL GREENS Plotted Development Choudhary Infraheight Private Limited RAJ-RERA-APP-P-2020-2201 (17/03/2020) RAJ/P/2020/1261 (29/05/2020)
Tonk GREEN CITY-A BLOCK Plotted Development SUN INDIA REALHOME LLP RAJ-RERA-APP-P-2020-2173 (04/03/2020) RAJ/P/2020/1260 (29/05/2020)
Ajmer Dream Homz Group Housing G S DREAMHOME LLP RAJ-RERA-APP-P-2020-2188 (13/03/2020) RAJ/P/2020/1259 (20/05/2020)
Jaipur KEDIA'S AMARA Group Housing KEDIA BUILDERS AND COLONIZERS PRIVATE LIMITED RAJ-RERA-APP-P-2020-2224 (13/05/2020) RAJ/P/2020/1258 (18/05/2020)
Jaipur Kuber Garden Group Housing PUNIT ESTATES PRIVATE LIMITED RAJ-RERA-APP-P-2020-2221 (29/04/2020) RAJ/P/2020/1257 (04/05/2020)
Kota SHUBH SAVERA Plotted Development SANTOSH SAINI RAJ-RERA-APP-P-2020-2222 (29/04/2020) RAJ/P/2020/1256 (02/05/2020)
Udaipur MIRACLE Group Housing BHOOMISHIV BUILDERS LLP RAJ-RERA-APP-P-2020-2117 (15/02/2020) RAJ/P/2020/1255 (02/05/2020)
Jaipur NANDAN PRIME VILLAS Group Housing NARENDRA KUMAR AGARWAL RAJ-RERA-APP-P-2020-2184 (11/03/2020) RAJ/P/2020/1254 (28/04/2020)
Jaipur Akshat Kanota Estate-Phase 3 Group Housing AKSHAT APARTMENTS PRIVATE LIMITED RAJ-RERA-APP-P-2020-2052 (24/01/2020) RAJ/P/2020/1253 (20/04/2020)
Jaipur SHREE RADHA KRISHNA APARTMENT Group Housing GURUSAIKRIPA BUILDERS LLP RAJ-RERA-APP-P-2020-2213 (19/03/2020) RAJ/P/2020/1252 (16/04/2020)
Jodhpur Mangaldeep Darshan Group Housing Mangaldeep DaRSHAN RAJ-RERA-APP-P-2020-2186 (12/03/2020) RAJ/P/2020/1251 (16/04/2020)
Sri Ganganagar SHREENATH ENCLAVE Plotted Development ANANDAM HEIGHTS DEVELOPERS PRIVATE LIMITED RAJ-RERA-APP-P-2020-2144 (27/02/2020) RAJ/P/2020/1250 (16/04/2020)
Jaipur SHEKHAWAT CREST Group Housing M R S B INFRA PROJECT PRIVATE LIMITED RAJ-RERA-APP-P-2020-2181 (11/03/2020) RAJ/P/2020/1249 (12/04/2020)
Kota S.S. TIRUPATI TOWER Mixed (Residential And Commercial) S S TIRUPATI INFRAPROJECTS RAJ-RERA-APP-P-2020-2123 (18/02/2020) RAJ/P/2020/1248 (12/04/2020)
Jhalawar Green Villas Group Housing CHAUDHARY BHOORAMAL DEVELOPERS RAJ-RERA-APP-P-2020-2139 (25/02/2020) RAJ/P/2020/1247 (09/04/2020)
Ajmer Samriddhi's Dynasty Group Housing SANKALP REALMART PVT LTD RAJ-RERA-APP-P-2020-2073 (01/02/2020) RAJ/P/2020/1246 (27/03/2020)
Udaipur ARCHI'S LOTUS PARK Group Housing ARCHI BUILDMART PRIVATE LIMITED RAJ-RERA-APP-P-2020-2171 (03/03/2020) RAJ/P/2020/1245 (27/03/2020)
Alwar KRISHAN KUNJ Plotted Development CHHOTE LAL MEENA RAJ-RERA-APP-P-2020-2067 (29/01/2020) RAJ/P/2020/1244 (27/03/2020)
Jodhpur SHANKHESHWAR NAGAR Plotted Development BALWANT RAM RAJ-RERA-APP-P-2020-2095 (10/02/2020) RAJ/P/2020/1243 (27/03/2020)
Jodhpur VEERPRATAP INDUSTRIAL PARK Plotted Development VICTORIA INFRA HOLDINGS PRIVATE LIMITED RAJ-RERA-APP-P-2019-1699 (23/10/2019) RAJ/P/2020/1242 (27/03/2020)
Jaipur Ram Awas Group Housing Shubhashish Builders and Developers RAJ-RERA-APP-P-2020-2023 (17/01/2020) RAJ/P/2020/1241 (27/03/2020)
Sikar SHREE HANUMAN HEIGHTS Commercial MAHADEV BUILDERS AND DEVELOPERS RAJ-RERA-APP-P-2020-2166 (03/03/2020) RAJ/P/2020/1240 (27/03/2020)
Sikar MADHUVAN HOMES Group Housing RAJENDRA SINGH KHICHAR RAJ-RERA-APP-P-2020-2155 (02/03/2020) RAJ/P/2020/1239 (27/03/2020)
Baran SUMERU SOHAM Mixed (Residential And Commercial) SUMERU LIFE SPACE INDIA PRIVATE LIMITED RAJ-RERA-APP-P-2020-2172 (03/03/2020) RAJ/P/2020/1238 (27/03/2020)
Jodhpur ASHAPURNA ANMOL PHASE-I Group Housing ASHAPURNA BUILDCON LIMITED RAJ-RERA-APP-P-2020-2090 (07/02/2020) RAJ/P/2020/1237 (27/03/2020)
Sirohi AYODHYAPURAM SHEOGANJ Group Housing RAMBHADEEP BUILDCON PRIVATE LIMITED RAJ-RERA-APP-P-2020-2111 (14/02/2020) RAJ/P/2020/1236 (27/03/2020)
Jaipur Bhavyaa Green Zenith Group Housing BHAVYAA GREEN BUILDERS RAJ-RERA-APP-P-2020-2163 (03/03/2020) RAJ/P/2020/1235 (20/03/2020)
Dholpur G.K. CITY Group Housing G K Builders RAJ-RERA-APP-P-2020-2065 (29/01/2020) RAJ/P/2020/1234 (20/03/2020)
Udaipur ARCHI'S PEARL PARADISE Group Housing ARCHI CIVIL CONSTRUCTION PRIVATE LIMITED RAJ-RERA-APP-P-2020-2142 (27/02/2020) RAJ/P/2020/1233 (20/03/2020)
Jaipur Stareef Suites 88 Group Housing Arihant Prime Buildtech LLP RAJ-RERA-APP-P-2020-2083 (05/02/2020) RAJ/P/2020/1232 (20/03/2020)
Jaipur HARITWAL CITY - D Plotted Development BHARURAM JAT RAJ-RERA-APP-P-2020-2119 (17/02/2020) RAJ/P/2020/1231 (19/03/2020)
Jodhpur CMJAY LORDI PANDIT JI PACKAGE-10 JODHPUR Group Housing JODHPUR DEVELOPMENT AUTHORITY RAJ-RERA-APP-P-2020-2191 (13/03/2020) RAJ/P/2020/1230 (18/03/2020)
Jaipur Vedic Villas Phase- II Group Housing KEDIA BUILDERS AND COLONIZERS PRIVATE LIMITED RAJ-RERA-APP-P-2020-2169 (03/03/2020) RAJ/P/2020/1229 (12/03/2020)
Tonk SHREE GANESH VATIKA Plotted Development RAM KRISHAN COLONIZERS AND DEVELOPEPRS PRIVATE LIMITED RAJ-RERA-APP-P-2020-2158 (02/03/2020) RAJ/P/2020/1228 (11/03/2020)
Jaipur Vinayak Residency A+B+C (Extension) Plotted Development Vinayak Developers RAJ-RERA-APP-P-2020-2092 (10/02/2020) RAJ/P/2020/1226 (11/03/2020)
Jaipur NIRANJAN VIHAR EXTENSION Plotted Development SHRI GOVARDHAN ESTATES PRIVATE LIMITED RAJ-RERA-APP-P-2020-2099 (11/02/2020) RAJ/P/2020/1225 (11/03/2020)
Jaipur SHREE PARSHVANATH ENCLAVE Group Housing PARSHVANATH INFRA PROJECT RAJ-RERA-APP-P-2020-2030 (21/01/2020) RAJ/P/2020/1224 (11/03/2020)
Jaipur Vrinda Gardens Phase V Group Housing Vista Housing RAJ-RERA-APP-P-2020-2097 (11/02/2020) RAJ/P/2020/1223 (06/03/2020)
Jaipur Ashiana Amantran Phase II Group Housing Ashiana Housing Limited RAJ-RERA-APP-P-2020-2125 (19/02/2020) RAJ/P/2020/1221 (06/03/2020)
Jaipur MANGLAM AANANDA PHASE III (B) Group Housing MANGLAM BUILD DEVELOPERS LIMITED RAJ-RERA-APP-P-2020-2152 (29/02/2020) RAJ/P/2020/1220 (06/03/2020)
Sirohi Karan Heights Group Housing Samdarshi Builders RAJ-RERA-APP-P-2020-2043 (23/01/2020) RAJ/P/2020/1219 (04/03/2020)
Alwar Krish City Centre Commercial Narmada Asbestos Pipes Private Limited RAJ-RERA-APP-P-2020-2021 (16/01/2020) RAJ/P/2020/1218 (04/03/2020)
Bhilwara OSTWAL EMPIRE-1 Plotted Development KULDEEP UMRAOSINGH OSTWAL RAJ-RERA-APP-P-2020-2040 (22/01/2020) RAJ/P/2020/1217 (04/03/2020)
Bhilwara OSTWAL EMPIRE-2 Plotted Development UMRAOSINGH PRITHVIRAJ OSTWAL RAJ-RERA-APP-P-2020-2039 (22/01/2020) RAJ/P/2020/1216 (04/03/2020)
Kota AKANSHA DEEP HEIGHTS Group Housing AKANSHA INFRA HOUSING PROJECTS RAJ-RERA-APP-P-2020-2122 (17/02/2020) RAJ/P/2020/1215 (04/03/2020)
Jodhpur NAKSHATRA Group Housing VISION ASSOCIATES RAJ-RERA-APP-P-2020-2070 (31/01/2020) RAJ/P/2020/1214 (03/03/2020)
Jodhpur CMJAY CHOKHA JODHPUR Group Housing JODHPUR DEVELOPMENT AUTHORITY RAJ-RERA-APP-P-2019-1514 (26/07/2019) RAJ/P/2020/1213 (02/03/2020)
Bikaner Shanti Nilay Group Housing Shanti Infrapromoters Private Limited RAJ-RERA-APP-P-2020-2036 (22/01/2020) RAJ/P/2020/1212 (02/03/2020)
Jaipur GOVINDAM TOWER Group Housing BRIJHARI HOMES LLP RAJ-RERA-APP-P-2020-2089 (07/02/2020) RAJ/P/2020/1208 (24/02/2020)
Jaipur Mukhya Mantri Rajya Sahayak Awasiya Karamchari Yojana Group Housing RAJASTHAN HOUSING BOARD RAJ-RERA-APP-P-2020-2126 (19/02/2020) RAJ/P/2020/1207 (21/02/2020)
Jaipur Ayush Market Plotted Development RAJASTHAN HOUSING BOARD RAJ-RERA-APP-P-2020-2128 (20/02/2020) RAJ/P/2020/1206 (21/02/2020)
Jaipur Kedia's The Oxygen Phase II Group Housing Radha Govind Colonizers RAJ-RERA-APP-P-2020-2103 (11/02/2020) RAJ/P/2020/1205 (19/02/2020)
Alwar Terra Aashray Group Housing Terra Realcon Private Limited RAJ-RERA-APP-P-2019-1530 (31/07/2019) RAJ/P/2020/1204 (19/02/2020)
Jodhpur EWS-335&LIG-153 Houses at Barli Scheme, Jodhpur under MGSY Group Housing RAJASTHAN HOUSING BOARD RAJ-RERA-APP-P-2020-2121 (17/02/2020) RAJ/P/2020/1203 (18/02/2020)
Jaipur GANESH VIHAR Plotted Development BIRDA RAM MEENA RAJ-RERA-APP-P-2019-1801 (24/12/2019) RAJ/P/2020/1202 (18/02/2020)
Jaipur SUMAN ENCLAVE H-BLOCK Plotted Development MS SAMRIDHI BUILDDEV PVT LTD RAJ-RERA-APP-P-2019-1802 (24/12/2019) RAJ/P/2020/1201 (18/02/2020)
Jaipur SOMYA SKY CREST Group Housing SOMYA BUILDHOME LLP RAJ-RERA-APP-P-2020-2062 (28/01/2020) RAJ/P/2020/1200 (17/02/2020)
Kota Neelkanth Residency Plotted Development Kailash Chand Malviya RAJ-RERA-APP-P-2019-1684 (09/10/2019) RAJ/P/2020/1199 (12/02/2020)
Jaipur VEDIC VILLAS PHASE-I Group Housing KEDIA BUILDERS AND COLONIZERS PRIVATE LIMITED RAJ-RERA-APP-P-2020-2072 (31/01/2020) RAJ/P/2020/1198 (12/02/2020)
Jaipur GOVINDAM PARADISE Group Housing BRIJHARI BUILDHOME LLP RAJ-RERA-APP-P-2020-2068 (29/01/2020) RAJ/P/2020/1197 (12/02/2020)
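If you would rather fetch the list in smaller chunks and save it instead of printing it, the request above can be wrapped roughly as follows. This is only a sketch: `fetch_all`, `write_csv` and `post_with_requests` are hypothetical helper names, and it assumes the endpoint honours `page` values above 1 and returns a short or empty `Items` list once the data runs out, which I have not verified. The field names and the `Data`/`Items` structure are the ones used in the snippet above.

```python
import csv
import io

URL = "http://rera.rajasthan.gov.in/Home/GetProjectsList"
FIELDS = ["DistrictName", "ProjectName", "ProjectTypeName",
          "PromoterName", "ApplicationNo", "CertificateNo"]


def fetch_all(post, page_size=100, max_pages=1000):
    """Collect items page by page.

    post(data) must return the decoded JSON response as a dict.
    Stops on an empty or short page (assumed to mean "last page");
    max_pages is a safety cap in case that assumption is wrong.
    """
    items = []
    for page in range(1, max_pages + 1):
        res = post({"PageSize": page_size, "page": page})
        batch = res["Data"]["Items"]
        if not batch:
            break
        items.extend(batch)
        if len(batch) < page_size:
            break
    return items


def write_csv(items, out):
    """Write the selected fields of each item to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(items)


def post_with_requests(data):
    """Real network call; requests is imported lazily (third-party)."""
    import requests
    return requests.post(URL, data=data, timeout=30).json()
```

A run against the site would then look like `write_csv(fetch_all(post_with_requests), open("projects.csv", "w", newline=""))`. Passing the POST function in as an argument also lets you try the paging logic against a stub before hitting the server.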
Upvotes: 1
Reputation: 193058
To scrape all the result pages of a search on http://rera.rajasthan.gov.in/ProjectSearch using Python and Selenium, you need to induce WebDriverWait for element_to_be_clickable(), and you can use the following Locator Strategies:
Code Block:
driver.get("http://rera.rajasthan.gov.in/ProjectSearch")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='dropdown-toggle' and contains(., 'Search')]"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='dropdown-toggle' and contains(., 'Search')]//following::ul[1]/li/a[text()='Project Search']"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@class='btn btn-primary']"))).click()
while True:
    try:
        WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='ds4u-footer']//div[@class='ds4u-pager']//a")))
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='ds4u-footer']//div[@class='ds4u-pager']//a[contains(@class, 'ds4u-selected')]//following::a[1]/span"))).click()
        print("Clicked for next page")
    except TimeoutException:
        print("No more pages to navigate")
        break
driver.quit()
Console Output:
Clicked for next page
Clicked for next page
Clicked for next page
...
...
...
No more pages to navigate
Upvotes: 0
Reputation: 1016
Try this:
def page():
    count = 0
    while True:
        try:
            count += 1
            driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH, "//*[@id='OuterProjectGrid']/div[4]/div[4]/a[1]"))))
            driver.find_element_by_xpath("//*[@id='OuterProjectGrid']/div[4]/div[4]/a["+str(count)+"]").click()
            print("Navigating to Next Page")
            time.sleep(5)
        except (TimeoutException, WebDriverException) as e:
            print("Last page reached")
            break

page()
Upvotes: 0