Reputation:
I am trying to print the results to the terminal, but I get this error message:
IndexError: list index out of range
Below is the code. Thanks in advance for your help; I am a complete beginner in this field.
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
option = Options()
option.add_argument("--disable-infobars")
option.add_argument("start-maximized")
option.add_argument("--disable-extensions")
option.add_experimental_option("excludeSwitches", ['enable-automation'])
# Pass the argument 1 to allow and 2 to block
option.add_experimental_option("prefs", {
    "profile.default_content_setting_values.notifications": 1
})
driver = webdriver.Chrome(chrome_options=option, executable_path='C:\\Users\\Sheik\\Desktop\\web crawling\\chromedriver.exe')
driver.implicitly_wait(5000)
url = "https://www.yell.com/"
driver.get(url)
search_query_path = driver.find_element_by_xpath('''//*[@id="search_keyword"]''')
search_query_path.click()
search_query_path.send_keys("Garage Services")
search_city_path = driver.find_element_by_xpath('''//*[@id="search_location"]''')
search_city_path.click()
search_city_path.send_keys("London")
search_btn = driver.find_element_by_xpath('''//*[@id="searchBoxForm"]/fieldset/div[1]/div[3]/button''')
search_btn.click()
names = driver.find_elements_by_class_name("businessCapsule--name")
address = driver.find_elements_by_class_name("businessCapsule--address")
num_page_items = len(names)
for i in range(num_page_items):
    print(f"{names[num_page_items].text} : {address[num_page_items].text}")
driver.close()
Upvotes: 0
Views: 418
Reputation: 193108
You were pretty close. To extract all the names and addresses from the webpage https://www.yell.com/ you have to induce WebDriverWait for visibility_of_all_elements_located(), so that the elements are collected only once they are visible.
You can use the following CSS selector based Locator Strategy:
Code Block:
names = [my_elem.text for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.businessCapsule--name")))]
address = [my_elem.text for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a.businessCapsule--address")))]
for i, j in zip(names, address):
    print("{} address is {}".format(i, j))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Console Output:
Diagnostic Services And Repairs Ltd address is 7.7 mi | R/O 310 Green Lanes, London, N13 5TT
Maypole Motors Ltd address is 11.5 mi | We serve London | Surbiton, KT6
Allen & Hall Motor Engineers Ltd address is 13.3 mi | We serve London | Romford, RM7
Ryecraft Motors & Body Shop address is 4.1 mi | 1a Old James St, London, SE15 3TS
M & K Garage Services address is 7.5 mi | 4 Oak Grove Rd, London, SE20 7RQ
Daytona Garage address is 4.9 mi | 98 Vale Rd, London, N4 1PZ
Mobile Hybrid Repair Ltd address is 37.8 mi | We serve London | Reading, RG30
C & E Motors address is 5.1 mi | Unit 6 Brookmarsh Trading Estate, Norman Rd, London, SE10 9QE
Stevens Motors address is 8.2 mi | 3a Wellington Rd, London, E11 2AN
M V Motor Repairs address is 5.6 mi | 7 Bidder St, London, E16 4ST
Carpenters Garage address is 6.1 mi | 69 Bickersteth Rd, London, SW17 9SH
Car Care Ealing address is 8.2 mi | 199 Northfield Avenue, London, W13 9QU
German Car Centre address is 7.3 mi | Unit 3, Hyde Estate Road, LONDON, NW9 6JX
Defoe Tyres Ltd address is 4.2 mi | 1a Defoe Rd, London, N16 0EP
Kwik Kar Service Centre address is 3.8 mi | 11 West Hampstead Mews, London, NW6 3BB
DPF Specialist Clinic address is 5.7 mi | 115 Lea Bridge Rd, London, E10 7AG
Blueflash Garage address is 3 mi | 21-25 Bedford Rd, Clapham North, London, SW4 7SH
National Tyres and Autocare address is 7.1 mi | 57 Kingston Road, London, SW19 1JN
National Tyres and Autocare address is 7.8 mi | 92-96 St Marys Road, Ealing, London, W5 5EX
Merton Autotechnics address is 7 mi | Unit 5 Station Rd, London, SW19 2LP
The Old Forge Garage address is 11.9 mi | We serve London | Belvedere, DA17
Green Man Tyre & Exhaust Ltd address is 8.8 mi | 1308 High Rd, London, N20 9HJ
Castle Motors Ltd address is 6.9 mi | The Rear Of Number 1 Sansom Rd, London, E11 3EY
National Tyres and Autocare address is 7.3 mi | We serve London | The Hyde, Hendon, NW9
Walthamstow Village Garage address is 7.3 mi | 28 Ravenswood Rd, London, E17 9LY
Upvotes: 0
Reputation: 1090
You are indexing the lists with their length (num_page_items) instead of the loop variable i. Valid indices run from 0 to num_page_items - 1, so that index is always out of range. Change
print(f"{names[num_page_items].text} : {address[num_page_items].text}")
to
print(f"{names[i].text} : {address[i].text}")
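To see why indexing with the length fails, here is a minimal sketch that uses a plain Python list as a stand-in for the Selenium results. The sample values are made up for illustration and are not taken from the page.

```python
# Hypothetical stand-in for the list returned by find_elements_by_class_name
names = ["Garage A", "Garage B", "Garage C"]
num_page_items = len(names)

# Valid indices run from 0 to num_page_items - 1
last_valid = names[num_page_items - 1]

# Indexing with the length itself is always one past the end
try:
    names[num_page_items]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(last_valid)
print(error_message)
```

The same thing happens in the question's loop on its very first pass, because the index never depends on i.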
Upvotes: 1
Reputation: 50864
You want to use the index i to iterate over names and address, not the list size
for i in range(num_page_items):
    print(f"{names[i].text} : {address[i].text}")
Or just loop on both with zip
for name, ad in zip(names, address):
    print(f"{name.text} : {ad.text}")
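One thing to be aware of with zip: it stops at the shorter of the two lists, so if a listing has no address element the loop silently drops the trailing names instead of raising an error. A small sketch with plain strings (made-up sample data, assuming one address is missing):

```python
# Hypothetical sample data: three names but only two addresses
names = ["Garage A", "Garage B", "Garage C"]
address = ["1 High St", "2 Low Rd"]

# zip pairs items positionally and stops when the shorter list runs out
pairs = [f"{name} : {ad}" for name, ad in zip(names, address)]

for line in pairs:
    print(line)
```

If the two lists can differ in length, comparing len(names) and len(address) before looping is a cheap sanity check.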
Upvotes: 4