Reputation: 1
I'm new to Python and have started with small web scraping projects. I have now tried to scrape this URL.
I want to collect the information in the blue and white boxes. More specifically, the price (536,25) and the name of the provider (Cheap Energy AB), and I would like to collect the top 3, like in the picture below.
The problem is that the output I get is only for the top alternative:
536,25 öre/kWh, Cheap Energy AB
The output I would like is:
536,25 öre/kWh, Cheap Energy AB
544,45 öre/kWh, Vattenfall AB
544,45 öre/kWh, Vattenfall AB
My code is the following:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
s = Service("/Users/brustabl1/lpthw/chromedriver")
url = "https://www.elpriskollen.se/sv/Avtalssida/?ellevid=236&postnummer=18164&forbrukning=20000&avtalId=31792&prevContractTypeId=20"
driver = webdriver.Chrome(service=s)
driver.maximize_window()
driver.implicitly_wait(10)
driver.get(url)
# Presses the button asking if you are an individual or a company
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="cookie-customer-type"]/div/div/div/div[3]/nav/ul/li[1]'))).click()
lists = driver.find_elements(By.XPATH, '//*[@id="main"]/div[3]/div[3]/div[2]/ul/li[1]/div[2]/div')
for list in lists:
    price = list.find_element(By.XPATH,'//*[@id="main"]/div[3]/div[3]/div[2]/ul/li[1]/div[2]/div/div[1]/div/div/div')
    name = list.find_element(By.XPATH, '//*[@id="main"]/div[3]/div[3]/div[2]/ul/li[1]/div[2]/div/div[2]/div/div/div/p[1]')
    print(price.text, name.text)
I have tried a few things. The first was to put a dot in front of the // in the find_element calls ('.//'), but the script doesn't like that. The output I get is:
Message: no such element: Unable to locate element: {"method":"xpath","selector":".//*[@id="main"]/div[3]/div[3]/div[2]/ul/li[1]/div[2]/div/div[1]/div/div/div"}
And right now, I'm kind of stuck.
Upvotes: 0
Views: 284
Reputation: 33361
There are several problems with your code.
The XPath //*[@id="main"]/div[3]/div[3]/div[2]/ul/li[1]/div[2]/div matches only 1 element, so your lists actually contains just 1 element.
You should also wait until the elements are visible, and only once the lists are visible should you collect them.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
# Raw string so the backslashes in the Windows path are not treated as escape sequences
webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(service=webdriver_service, options=options)
url = 'https://www.elpriskollen.se/sv/Avtal/?avtalstypid=20&forbrukning=20000&postnr=18164'
driver.get(url)
wait = WebDriverWait(driver, 20)
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '.modal-content li.customertype.privat'))).click()
panels = wait.until(EC.visibility_of_all_elements_located((By.CLASS_NAME, 'panel-body')))
for panel in panels:
    price = panel.find_element(By.XPATH,".//div[contains(@class,'epk-list-price')]//div[@class='epk-list-cell']")
    name = panel.find_element(By.XPATH, ".//p[@class='epk-avtalsinfo'][1]")
    print(price.text, name.text)
And this is the output:
536,25 öre/kWh Cheap Energy AB
544,45 öre/kWh Vattenfall AB
544,45 öre/kWh Vattenfall AB
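To illustrate why the locators matter, here is a minimal sketch that runs without a browser. It uses Python's built-in xml.etree.ElementTree as a stand-in for Selenium, on a made-up, heavily simplified snippet of markup (the price and name class names are hypothetical, not the real page structure). The idea it shows should carry over: a search that starts from the whole document returns the first match on every iteration, a dot-prefixed path that still mentions id="main" finds nothing because main is an ancestor of the panel rather than a descendant, and a short path relative to each panel returns that panel's own values.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page is nested much more deeply.
HTML = """
<div id="main">
  <ul>
    <li><div class="panel-body"><div class="price">536,25 öre/kWh</div><p class="name">Cheap Energy AB</p></div></li>
    <li><div class="panel-body"><div class="price">544,45 öre/kWh</div><p class="name">Vattenfall AB</p></div></li>
    <li><div class="panel-body"><div class="price">544,45 öre/kWh</div><p class="name">Vattenfall AB</p></div></li>
  </ul>
</div>
"""

root = ET.fromstring(HTML)

# One container element per offer, comparable to the panels in the code above.
panels = root.findall(".//div[@class='panel-body']")

# Searching from the document root inside the loop ignores the current panel,
# so every iteration yields the first match in the whole document.
global_names = [root.find(".//p[@class='name']").text for _ in panels]

# A dot-prefixed path is relative to the panel, but id="main" is an ancestor
# of the panel, not a descendant, so nothing is found.
missing = panels[0].find(".//*[@id='main']")

# A short path relative to each panel returns that panel's own values.
offers = [
    (panel.find(".//div[@class='price']").text,
     panel.find(".//p[@class='name']").text)
    for panel in panels
]

for price, name in offers:
    print(price, name)
```

In Selenium the same reasoning applies as far as I know: an XPath that begins with // is evaluated against the whole document even when find_element is called on an element, while one that begins with .// is evaluated relative to that element.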
Upvotes: 1