Reputation: 47
This is my code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys # For being able to input key presses
import time # Useful for if your browser is faster than your code
PATH = r"C:\Users\Alireza\Desktop\chromedriver\chromedriver.exe" # Location of the chromedriver
driver = webdriver.Chrome(PATH)
driver.get("https://de.wikipedia.org/wiki/Alpha-Beta-Suche") # Open website in Chrome
print(driver.title) # Print title of the website to console
x = 1 #Debug variable
litwebs = driver.find_elements_by_xpath("//ul") # Literatur and Weblinks body
for lit in litwebs:
    try:
        l = lit.find_elements_by_tag_name("li")
        print("\n" + l.text)
    except:
        print("\n no li")
driver.quit()
Unfortunately, it always jumps to the except block and prints "no li" instead of the text of the "li" elements.
The elements do contain text, as you can check yourself; they are the literature and web links blocks of the Wikipedia page. If I remove the try/except block, I get this error message instead:
Traceback (most recent call last):
  File "c:\Users\Alireza\Desktop\workspace\webscraping\Literaturverzeichnis.py", line 13, in <module>
    print("\n" + litwebs.text)
AttributeError: 'list' object has no attribute 'text'
I really don't understand this.
Upvotes: 0
Views: 181
Reputation: 29362
Change this line
l = lit.find_elements_by_tag_name("li")
to
l = lit.find_element_by_tag_name("li")
Explanation:
litwebs is a list of Selenium web elements. With l = lit.find_elements_by_tag_name("li") you ask, for each element in litwebs, for a new list of its li web elements, and then you call .text on that list. A Python list has no .text attribute, so Python raises an AttributeError; because of your bare except, the error is swallowed and the except block runs instead. find_element_by_tag_name (singular) returns a single web element, the first matching li, and that element does have .text.
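To see the same error without a browser, here is a minimal sketch. StandInElement is a hypothetical stand-in for a Selenium web element (it is not part of Selenium), and the item texts are made up for illustration. It shows that .text fails on the list itself but works on each item, and that catching the specific exception keeps the real message visible instead of hiding it behind a bare except.

```python
class StandInElement:
    """Hypothetical stand-in for a Selenium web element; it only has .text."""

    def __init__(self, text):
        self.text = text


# Plays the role of what find_elements_by_tag_name("li") returns: a plain list.
items = [StandInElement("first entry"), StandInElement("second entry")]

try:
    print(items.text)  # same mistake as l.text in the question
except AttributeError as exc:
    # Catching the specific exception and printing it shows the real cause.
    error_message = str(exc)
    print("Caught:", error_message)

# Reading .text from each element works.
texts = [item.text for item in items]
print(texts)
```

In the question's loop, the equivalent change would be to iterate over l and print each element's text.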
Upvotes: 1
Reputation: 1
I think you have to use one of the .find_elements_by_* methods and print each of the results.
Upvotes: 0
Reputation: 617
The issue is with the line l = lit.find_elements_by_tag_name("li"): it returns a list of elements, not a single element, so you have to iterate over it. Use the following code to get what you need:
x = 1 #Debug variable
litwebs = driver.find_elements_by_tag_name("ul") # Literatur and Weblinks body
for lit in litwebs:
    try:
        l = lit.find_elements_by_tag_name("li") # A list of li elements
        for ll in l:
            print(ll.text) # Read .text from each element, not from the list
    except:
        print("\n no li")
    print('\n\n')
Upvotes: 1
Reputation: 33361
You are calling
litwebs = driver.find_elements_by_xpath("//ul")
immediately after opening the URL with
driver.get("https://de.wikipedia.org/wiki/Alpha-Beta-Suche")
so you may be looking for ul elements before the page is fully rendered, when they do not exist yet.
Also, for each lit you are getting a list l. You should iterate over the elements of l to get their text.
To make your code work, you can simply add a hardcoded sleep, like:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys # For being able to input key presses
import time # Useful for if your browser is faster than your code
PATH = r"C:\Users\Alireza\Desktop\chromedriver\chromedriver.exe" # Location of the chromedriver
driver = webdriver.Chrome(PATH)
driver.get("https://de.wikipedia.org/wiki/Alpha-Beta-Suche")
time.sleep(10)
print(driver.title)
x = 1 #Debug variable
litwebs = driver.find_elements_by_xpath("//ul") # Literatur and Weblinks body
for lit in litwebs:
    try:
        ll = lit.find_elements_by_tag_name("li")
        for l in ll:
            print("\n" + l.text)
    except:
        print("\n no li")
driver.quit()
But a better way is to use an explicit wait with expected conditions.
Upvotes: 1