python selenium selenium-webdriver web-scraping selenium-chromedriver

Reputation: 103

'list' object has no attribute 'get_attribute' while iterating through WebElements

I'm trying to use Python and Selenium to scrape multiple links on a web page. I'm using find_elements_by_xpath and can locate a list of elements, but I'm having trouble turning the returned list into the actual href links. I know find_element_by_xpath works, but only for a single element.

Here is my code:

from selenium import webdriver

path_to_chromedriver = 'path to chromedriver location'
browser = webdriver.Chrome(executable_path = path_to_chromedriver)

browser.get("file:///path to html file")

all_trails = []

#finds all elements with the class 'text-truncate trail-name' then 
#retrieve the a element
#this seems to be just giving us the element location but not the 
#actual location

find_href = browser.find_elements_by_xpath('//div[@class="text truncate trail-name"]/a[1]')
all_trails.append(find_href)

print(all_trails)

This code is returning:

<selenium.webdriver.remote.webelement.WebElement 
(session="dd178d79c66b747696c5d3750ea8cb17", 
element="0.5700549730549636-1663")>, 
<selenium.webdriver.remote.webelement.WebElement 
(session="dd178d79c66b747696c5d3750ea8cb17", 
element="0.5700549730549636-1664")>,

I expect all_trails to be a list of links like: www.google.com, www.yahoo.com, www.bing.com.

I've tried looping through the all_trails list and calling get_attribute('href') on it, but I get the error from the title: 'list' object has no attribute 'get_attribute'.

Does anyone have any idea how to convert the Selenium WebElements to href links?

Any help would be greatly appreciated :)

Upvotes: 4

Answers (5)

Jagannath Padhy

Reputation: 1

get_attribute works on the elements of that list, not on the list itself. For example:

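import time

from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Assumes `driver` is an already-created WebDriver instance.
def fetch_img_urls(search_query: str):
    driver.get('https://images.google.com/')
    search = driver.find_element(By.CLASS_NAME, "gLFyf.gsfi")
    search.send_keys(search_query)
    search.send_keys(Keys.RETURN)
    links = []
    try:
        time.sleep(5)  # crude fixed wait for the results to load
        # find_elements returns a list, so get_attribute is called per element
        urls = driver.find_elements(By.CSS_SELECTOR, 'a.VFACy.kGQAp.sMi44c.lNHeqe.WGvvNb')
        for url in urls:
            links.append(url.get_attribute("href"))
        print(links)
    except Exception as e:
        print(f'error: {e}')
        driver.quit()
    return links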

Upvotes: 0

Vijay Anand

Reputation: 41

find_elements_by_css_selector returns a list of WebElements, so you need to loop through the list and call get_attribute on each WebElement. If you only need one element, use the singular form, find_element_by_css_selector, which returns a single WebElement.

Upvotes: 2

Ratmir Asanov

Reputation: 6459

If you have the following HTML:

<div class="text-truncate trail-name">
<a href="http://google.com">Link 1</a>
</div>
<div class="text-truncate trail-name">
<a href="http://google.com">Link 2</a>
</div>
<div class="text-truncate trail-name">
<a href="http://google.com">Link 3</a>
</div>
<div class="text-truncate trail-name">
<a href="http://google.com">Link 4</a>
</div>

Your code should look like:

all_trails = []

all_links = browser.find_elements_by_css_selector(".text-truncate.trail-name>a")

for link in all_links:
    all_trails.append(link.get_attribute("href"))

Here all_trails is a list of the href values of the links (Link 1, Link 2 and so on).
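With the sample HTML above, all four anchors share one href, so the list ends up holding the same URL four times. If repeats are unwanted, one option is to drop them while keeping the original order. This is only a sketch: `unique_in_order` is a hypothetical helper name, and the list is written out by hand using the href values from the HTML (a real browser may report them in normalised form, for example with a trailing slash).

```python
def unique_in_order(links):
    """Drop repeated links while keeping first-seen order."""
    # dict keys keep insertion order (Python 3.7+), so they act as an ordered set.
    return list(dict.fromkeys(links))


# Hand-written stand-in for what the loop would collect from the four anchors.
all_trails = ["http://google.com"] * 4

unique_trails = unique_in_order(all_trails)
print(unique_trails)
```

A plain `set` would also remove the repeats, but it does not preserve the order in which the links appear on the page.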

Hope it helps you!

Upvotes: 4

Nimish Bansal

Reputation: 1759

all_trails = []
find_href = browser.find_elements_by_xpath('//div[@class="text truncate trail-name"]/a[1]')
for i in find_href:
    all_trails.append(i.get_attribute('href'))

get_attribute works on the elements of that list, not on the list itself.

Upvotes: 2

undetected Selenium

Reputation: 193378

Let us see what's happening in your code:

Without seeing the HTML, it seems the following line returns two WebElements in the list find_href, which is in turn appended to the all_trails list:

find_href = browser.find_elements_by_xpath('//div[@class="text truncate trail-name"]/a[1]')

Hence, when we print the list all_trails, both WebElements are printed and no error is raised.

According to the error snapshot you provided, you are trying to invoke the get_attribute("href") method on a list, which is not supported. Hence you see the error:

'list' object has no attribute 'get_attribute'
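The difference between the list and its elements can be reproduced without a browser. The sketch below is an illustration only: `FakeElement` is a hypothetical stand-in for Selenium's WebElement (it is not a Selenium class), and the two URLs are made up. It shows why printing works but the method call fails once the whole list has been appended as a single item.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (not a Selenium class)."""

    def __init__(self, href):
        self._href = href

    def get_attribute(self, name):
        # Only "href" is modelled here; any other attribute yields None.
        return self._href if name == "href" else None


# Stand-in for what find_elements_by_xpath returns: a plain Python list.
find_href = [FakeElement("http://google.com"), FakeElement("http://yahoo.com")]

all_trails = []
all_trails.append(find_href)  # appends the whole list as ONE item

nested_length = len(all_trails)                  # one item: the inner list
inner_is_list = isinstance(all_trails[0], list)  # that item is itself a list

# Looping over all_trails therefore yields the inner list, not a WebElement.
try:
    for item in all_trails:
        item.get_attribute("href")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(nested_length, inner_is_list)
print(error_message)
```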

Solution:

To get the href attribute, we have to iterate over the list as follows:

find_href = browser.find_elements_by_xpath('//your_xpath')
for my_href in find_href:
    print(my_href.get_attribute("href"))
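The same loop can be wrapped in a small helper that returns the links instead of printing them. This is a minimal sketch rather than the answerer's code: `collect_hrefs` is a hypothetical name, and `StubElement` exists only so the example runs without a browser. Note that on Selenium 4.3 and later the `find_elements_by_xpath` helper is no longer available, so with a real driver the list would come from `browser.find_elements(By.XPATH, '//your_xpath')`.

```python
def collect_hrefs(elements):
    """Return the href of every element, skipping elements that have none."""
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        if href:  # get_attribute gives None when the attribute is absent
            hrefs.append(href)
    return hrefs


class StubElement:
    """Hypothetical stand-in for a WebElement so the sketch runs offline."""

    def __init__(self, href):
        self._href = href

    def get_attribute(self, name):
        return self._href if name == "href" else None


# With a real driver, pass in the list returned by find_elements instead.
elements = [StubElement("http://google.com"), StubElement(None), StubElement("http://bing.com")]
links = collect_hrefs(elements)
print(links)
```

Passing the list of elements into the helper, rather than the driver, keeps the function independent of which locator strategy produced the list.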

Upvotes: 11
