Reputation: 43
I am trying to scrape this website using Python and Selenium. Scraping works fine without a for loop, but as soon as I use a for loop to scrape the elements I get errors. I also tried a while loop with try/except, but that did not help. This is my Python code:
from logging import exception
from typing import Text
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
import time
import pandas as pd
from selenium.webdriver.support.ui import Select
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
import csv
from selenium import webdriver
PATH = "C:/ProgramData/Anaconda3/scripts/chromedriver.exe"  # always keep chromedriver.exe inside scripts to save hours of debugging
driver = webdriver.Chrome(PATH)  # pretty important part
driver.get("https://www.gharghaderi.com/")
driver.implicitly_wait(10)
house = driver.find_elements_by_class_name('griddetails')
for x in house:
    driver.get(x)
    print(x.text)
I keep getting an error as soon as I add the for loop.
Upvotes: 2
Views: 101
Reputation: 29382
When you write:
for x in house:
you are iterating over every x in the house list. That list contains all the web elements with the class griddetails, and inside the loop you call
driver.get(x)
which tries to open a web element as if it were a page. That is wrong: get() only accepts a URL in string format.
If you just want to print the details, you can do this:
house = driver.find_elements_by_class_name('griddetails')
for x in house:
    print(x.text)
This should give you the proper output.
Sample output:
रु. 2,50,00,000
Land: 0-4-0-4 Road: 12 ft
Chapali 2 Budhanilkantha, Kathmandu
Chandra Bahadur Karki
ID 472
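If the goal was actually to open each listing, as the driver.get(x) call in the question suggests, one possible approach is to collect URL strings first and only then navigate. This is a rough sketch, not tested against the site: it assumes each griddetails element contains at least one <a> tag with an href, and the helper names collect_links and visit_all are made up for illustration. The string "tag name" is the value of Selenium's By.TAG_NAME locator.

```python
def collect_links(cards):
    """Collect href strings from the <a> tags inside each card element.

    ASSUMPTION: every 'griddetails' element wraps at least one link;
    this was not checked against the live page.
    """
    links = []
    for card in cards:
        # "tag name" is the locator string behind By.TAG_NAME
        for anchor in card.find_elements("tag name", "a"):
            href = anchor.get_attribute("href")
            if href and href not in links:
                links.append(href)
    return links


def visit_all(driver, links):
    """Open each URL string in turn and return the page titles."""
    titles = []
    for url in links:
        driver.get(url)  # get() receives a str here, not a WebElement
        titles.append(driver.title)
    return titles
```

With a real driver this might be called as links = collect_links(driver.find_elements("class name", "griddetails")) followed by visit_all(driver, links). The links are gathered before any navigation because elements found on the first page become stale once the browser leaves it.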
Update 1: to save the details to a CSV file, collect the text of each element in a list and write it out with pandas:
house_list = []
house = driver.find_elements_by_class_name('griddetails')
for x in house:
    house_list.append(x.text)
data = {
    'Details': house_list
}
df = pd.DataFrame.from_dict(data)
df.to_csv('out.csv', index=False)
Imports:
import pandas as pd
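The update above stores each listing as one multi-line string in a single Details column. If separate columns are preferred, something along these lines could split the text first. It is only a sketch: the column names are hypothetical and the line order is guessed from the single sample output shown earlier, so real listings may not follow it. It uses the standard csv module to stay self-contained; the resulting rows could equally be passed to pd.DataFrame.

```python
import csv

# Hypothetical column names, guessed from the one sample output above.
FIELDS = ["price", "land_and_road", "address", "contact", "listing_id"]


def parse_card(text):
    """Split one card's text into a dict, one non-empty line per field."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    row = dict(zip(FIELDS, lines))
    # Keep unexpected extra lines instead of silently dropping them.
    row["extra"] = " | ".join(lines[len(FIELDS):])
    return row


def rows_to_csv(rows, fh):
    """Write parsed rows to an open text file handle."""
    writer = csv.DictWriter(fh, fieldnames=FIELDS + ["extra"])
    writer.writeheader()
    writer.writerows(rows)  # missing fields are written as empty cells
```

In the loop from the update, house_list.append(parse_card(x.text)) would then build the rows, and rows_to_csv(house_list, open('out.csv', 'w', newline='', encoding='utf-8')) would write them.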
Upvotes: 2
Reputation: 21
Your error points to the driver.get(x) line inside the for loop (line 19). driver.get() expects a URL to open, but I believe you are passing it the HTML elements with class name griddetails. What you want is the text inside that HTML.
Inside the loop, try printing x or x.text to see what x is. Then work out how to extract the text you want. It looks like the text you want is inside the span tag, so try looking there and find a way to extract it. Sorry, I can't test the code myself at the moment.
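A small inspection helper along these lines might make that check easier. It is untested against the site and assumes the text really does sit in <span> tags inside each griddetails element; span_texts and describe are made-up names.

```python
def span_texts(card):
    """Return the non-empty text of every <span> inside one card element."""
    # "tag name" is the locator string behind By.TAG_NAME
    spans = card.find_elements("tag name", "span")
    return [s.text.strip() for s in spans if s.text.strip()]


def describe(cards):
    """Print what each loop variable is and which span texts it holds."""
    for i, card in enumerate(cards):
        print(i, type(card).__name__, span_texts(card))
```

Calling describe(house) inside the original script would show that each x is a WebElement rather than a URL, along with whatever span text it contains.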
Upvotes: 0