Juicy Samurai

Reputation: 47

Why does Selenium prevent me from printing the text from "ul" and "li" elements?

This is my code:

from selenium import webdriver                  
from selenium.webdriver.common.keys import Keys                                             # For being able to input key presses
import time                                                                                 # Useful for if your browser is faster than your code
PATH = r"C:\Users\Alireza\Desktop\chromedriver\chromedriver.exe"                            # Location of the chromedriver
driver = webdriver.Chrome(PATH)

driver.get("https://de.wikipedia.org/wiki/Alpha-Beta-Suche")                                # Open website in Chrome
print(driver.title)                                                                         # Print title of the website to console

x = 1 #Debug variable

litwebs = driver.find_elements_by_xpath("//ul")                                                  # Literatur and Weblinks body 
for lit in litwebs:
    try:
        l = lit.find_elements_by_tag_name("li")
        print("\n" + l.text)
    except:
        print("\n no li")


driver.quit()  

Unfortunately, it always jumps to the except block and therefore prints "no li" instead of printing the text from the "li" elements.

The elements do contain text, as you can check yourself; they are the literature and web-links blocks on the Wikipedia page. Also, if I do not use the try/except block, it throws this error message:

Traceback (most recent call last):
  File "c:\Users\Alireza\Desktop\workspace\webscraping\Literaturverzeichnis.py", line 13, in <module>
    print("\n" + litwebs.text)
AttributeError: 'list' object has no attribute 'text'

I really don't understand this.

Upvotes: 0

Views: 181

Answers (4)

cruisepandey

Reputation: 29362

Change this line

l = lit.find_elements_by_tag_name("li")

to

l = lit.find_element_by_tag_name("li")

Explanation: litwebs is a Python list of Selenium web elements. The line l = lit.find_elements_by_tag_name("li") asks, for each element of litwebs, for a new list of the li web elements inside it. Calling .text on that list does not make sense, because a list has no .text attribute; Python raises an AttributeError, and since the call sits inside a bare except, execution jumps to the except block. Note that find_element_by_tag_name returns only the first matching li in each ul.
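The failure can be reproduced without a browser. Below is a minimal sketch, assuming a stand-in class FakeElement and a helper describe (both hypothetical, not part of Selenium) that mimic only the .text attribute of a web element:

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; it only carries .text."""

    def __init__(self, text):
        self.text = text


def describe(obj):
    """Return obj.text, or the AttributeError message if obj has no .text."""
    try:
        return obj.text
    except AttributeError as exc:
        return f"AttributeError: {exc}"


# find_elements_* returns a plain Python list of elements.
items = [FakeElement("first entry"), FakeElement("second entry")]

list_result = describe(items)                # the list itself has no .text
item_results = [describe(i) for i in items]  # each element in the list does

print(list_result)
print(item_results)
```

Catching AttributeError specifically, rather than using a bare except, keeps the real cause visible instead of hiding it behind a "no li" message.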

Upvotes: 1

Dee Gee

Reputation: 1

I think you have to use the .find_elements_by_ method, loop over the results, and print each one.

Upvotes: 0

Hammad

Reputation: 617

The issue is with this line of code: l = lit.find_elements_by_tag_name("li"). It returns a list, and a list has no .text attribute; only the individual elements do. Use the following code to get what you need:

x = 1 #Debug variable

litwebs = driver.find_elements_by_tag_name("ul")      # Literatur and Weblinks body
for lit in litwebs:
    items = lit.find_elements_by_tag_name("li")       # A list of li elements, possibly empty
    if not items:
        print("\n no li")
    for item in items:
        print(item.text)                              # .text exists on each element, not on the list
    print('\n\n')

Upvotes: 1

Prophet

Reputation: 33361

You are calling

litwebs = driver.find_elements_by_xpath("//ul")

immediately after opening the URL with

driver.get("https://de.wikipedia.org/wiki/Alpha-Beta-Suche")

so you may be looking for ul elements before the page is fully rendered, when no ul elements exist yet.
Also, for each lit you are getting a list l.
You should iterate over each element inside l to get its text.
To make your code work, you can simply add a hardcoded sleep, like:

from selenium import webdriver                  
from selenium.webdriver.common.keys import Keys                                             # For being able to input key presses
import time                                                                                 # Useful for if your browser is faster than your code
PATH = r"C:\Users\Alireza\Desktop\chromedriver\chromedriver.exe"                            # Location of the chromedriver
driver = webdriver.Chrome(PATH)

driver.get("https://de.wikipedia.org/wiki/Alpha-Beta-Suche")    
time.sleep(10) 
print(driver.title)                                                                         

x = 1 #Debug variable

litwebs = driver.find_elements_by_xpath("//ul")                                                  # Literatur and Weblinks body 
for lit in litwebs:
    try:
        ll = lit.find_elements_by_tag_name("li")
        for l in ll:    
            print("\n" + l.text)
    except:
        print("\n no li")


driver.quit()

A better approach, though, is an explicit wait implemented with expected conditions.

Upvotes: 1
