Reputation: 17
I am trying to use Selenium to scrape the first paragraph of Wikipedia pages using CSS selectors.
When I run this code, it seems to select a paragraph from the original page (the Wikipedia main page)
and not from the page I am searching for, in this case 'cats'.
Any help with this would be awesome!
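from selenium import webdriver
# Raw string (r'...') so the backslashes in the Windows path are not treated as escape sequences
browser = webdriver.Firefox(executable_path=r'D:\Import Files that I also want backed up\Jupyter Notebooks\Python Projects\Selenium\driverss\geckodriver.exe')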
browser.get('https://en.wikipedia.org')
search_elem = browser.find_element_by_css_selector('#searchInput')
search_elem.send_keys('cats')
search_elem.submit()
results_elem = browser.find_element_by_css_selector('p')
print(results_elem.text)
output:
Adventure Time is an American fantasy animated television series created .....
Upvotes: 0
Views: 117
Reputation: 33384
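To get the first paragraph text from the wiki page, wait for the results page to load before reading the element.
Your code calls find_element right after submit(), so it most likely runs while the original page is still displayed, which is why it returns text from the main page.
Use WebDriverWait() with visibility_of_element_located() and the following CSS selector.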
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
# Raw string (r'...') so the backslashes in the Windows path are not treated as escape sequences
browser = webdriver.Firefox(executable_path=r'D:\Import Files that I also want backed up\Jupyter Notebooks\Python Projects\Selenium\driverss\geckodriver.exe')
browser.get('https://en.wikipedia.org')
search_elem = browser.find_element_by_css_selector('#searchInput')
search_elem.send_keys('cats')
search_elem.submit()
# Wait up to 10 seconds for the third <p> inside the article body to become visible
results_elem = WebDriverWait(browser, 10).until(
    EC.visibility_of_element_located(
        (By.CSS_SELECTOR, "div.mw-parser-output p:nth-of-type(3)")
    )
)
print(results_elem.text)
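One caveat: p:nth-of-type(3) hard-codes the position of the paragraph, and that position may differ from article to article (for example if an article has a different number of empty or hidden paragraphs before the intro). As a rough sketch, and not something the original code does, you could instead collect all the paragraphs and take the first one whose text is not blank. The helper name first_nonempty_text is made up for illustration; it only assumes each element has a .text attribute, which Selenium's WebElement provides.

```python
def first_nonempty_text(elements):
    """Return the stripped text of the first element whose text is not blank.

    `elements` is any iterable of objects with a `.text` attribute, such as
    the list returned by Selenium's find_elements(). Returns None when every
    element is blank or the iterable is empty.
    """
    for element in elements:
        text = (element.text or "").strip()
        if text:
            return text
    return None


# Hypothetical usage with the `browser` object from the code above,
# once the results page has loaded:
#   from selenium.webdriver.common.by import By
#   paragraphs = browser.find_elements(By.CSS_SELECTOR, "div.mw-parser-output p")
#   print(first_nonempty_text(paragraphs))
```

You would still want to wait for the results page first (as in the WebDriverWait call above), otherwise find_elements may run against the old page just like in the question.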
Upvotes: 1