Reputation: 802
My code opens a webpage, and I want to scrape the href/HTML of each listing on that page.
(This code goes to a website which has 2)
I tried XPath and BeautifulSoup, but both return an empty list for me.
Here is the code:
import time
from selenium import webdriver
import pandas as pd
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
driver = webdriver.Chrome()
bracket=[]
driver.get('https://casehippo.com/spa/symposium/national-kidney-foundation-2021-spring-clinical-meetings/event/gallery/?search=Patiromer')
time.sleep(3)
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'html.parser')
eachRow=driver.find_element_by_partial_link_text('symposium')
print(eachRow.text)
Upvotes: 0
Views: 75
Reputation: 3537
I just ran the code you provided. BeautifulSoup populated the soup variable with the full page source successfully:
soup = BeautifulSoup(page_source, 'html.parser')
but on the next line:
eachRow=driver.find_element_by_partial_link_text('symposium')
an exception was raised with the message:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"partial link text","selector":"symposium"}
It seems you're using an incorrect selector. Try something like:
element = driver.find_element_by_xpath("//a[@class='title ng-binding']")
The code I'm using:
import time
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Chrome()
try:
    bracket = []
    driver.get(
        'https://casehippo.com/spa/symposium/national-kidney-foundation-2021-spring-clinical-meetings/event/gallery/?search=Patiromer')
    time.sleep(3)
    page_source = driver.page_source
    soup = BeautifulSoup(page_source, 'html.parser')
    print(soup)
    element = driver.find_element_by_xpath("//a[@class='title ng-binding']")
    print(element.get_attribute('href'))
    elements = driver.find_elements_by_xpath("//a[@class='title ng-binding']")
    for el in elements:
        print(el.get_attribute('href'))
finally:
    driver.quit()
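One thing that may also matter here: the code relies on a fixed time.sleep(3), and if the page renders its links later than that, find_elements can come back empty. Selenium ships WebDriverWait for this (the question already imports it). The sketch below is only a library-free illustration of the same polling idea, not Selenium's implementation. The helper name wait_until, its parameters, and the fake responses are all made up for the example.

```python
import time


def wait_until(fetch, timeout=10.0, poll=0.5):
    """Call fetch() repeatedly until it returns a non-empty result or the timeout elapses.

    Returns the last result, which is still empty if nothing appeared in time.
    """
    deadline = time.monotonic() + timeout
    result = fetch()
    while not result and time.monotonic() < deadline:
        time.sleep(poll)
        result = fetch()
    return result


# Stand-in for a page that renders its links late: empty twice, then two hrefs.
# These values are invented for the demo; they do not come from the real site.
_responses = iter([[], [], ["/listing/1", "/listing/2"]])
links = wait_until(lambda: next(_responses), timeout=5.0, poll=0.01)
print(links)
```

In the scraper above, fetch could be something like lambda: driver.find_elements_by_xpath("//a[@class='title ng-binding']"), so the loop stops as soon as at least one link is present instead of always sleeping for three seconds.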
Upvotes: 1
Reputation: 3801
Updated code:
import time
from selenium import webdriver
import pandas as pd
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
driver = webdriver.Chrome()
bracket=[]
driver.get('https://casehippo.com/spa/symposium/national-kidney-foundation-2021-spring-clinical-meetings/event/gallery/?search=Patiromer')
time.sleep(3)
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'html.parser')
eachRow=driver.find_elements_by_xpath("//a[contains(@ui-sref,'symposium')]")
for row in eachRow:
    print(row.text)
You need to use find_elements (not find_element) when there is more than one match, and then iterate over the results to read their values. Also, partial link text won't work here because "symposium" is not part of the link's regular visible text. It is embedded elsewhere in the element (the XPath above matches it in the ui-sref attribute), so an XPath is needed.
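To illustrate the difference between matching on link text and matching on an attribute, here is a small sketch that needs no browser. The markup in SAMPLE_HTML is hypothetical, loosely modelled on the XPath above, and the real page may look different. AnchorCollector is a helper written just for this example, using only the standard library's html.parser.

```python
from html.parser import HTMLParser

# Hypothetical markup; the real page's attributes and text may differ.
SAMPLE_HTML = """
<div class="listing">
  <a class="title ng-binding" ui-sref="symposium.event.detail({id: 1})" href="/spa/symposium/example/event/1">Patiromer poster A</a>
  <a class="title ng-binding" ui-sref="symposium.event.detail({id: 2})" href="/spa/symposium/example/event/2">Patiromer poster B</a>
</div>
"""


class AnchorCollector(HTMLParser):
    """Collect (attributes, visible text) for every <a> element."""

    def __init__(self):
        super().__init__()
        self.anchors = []    # finished anchors as (attrs dict, text)
        self._attrs = None   # attrs of the <a> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._attrs = dict(attrs)
            self._text = []

    def handle_data(self, data):
        if self._attrs is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._attrs is not None:
            self.anchors.append((self._attrs, "".join(self._text).strip()))
            self._attrs = None


collector = AnchorCollector()
collector.feed(SAMPLE_HTML)

# Roughly what partial link text looks at: the visible text only.
by_link_text = [attrs for attrs, text in collector.anchors if "symposium" in text]
# Roughly what //a[contains(@ui-sref,'symposium')] looks at: the attribute value.
by_attribute = [attrs for attrs, text in collector.anchors
                if "symposium" in attrs.get("ui-sref", "")]

print(len(by_link_text))
print(len(by_attribute))
for attrs in by_attribute:
    print(attrs["href"])
```

With this sample markup the text-based filter finds nothing, while the attribute-based filter finds both links, which mirrors why find_element_by_partial_link_text('symposium') raised NoSuchElementException.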
Upvotes: 0