Bangbangbang

Reputation: 560

Python/Selenium - how to loop through hrefs in <li>?

Web URL: https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times

I want to parse the HTML as below:

[Image: screenshot of the page HTML, not included]

I want to get all the hrefs within the <li> elements, along with the highlighted text. I tried the following code:

elementList = driver.find_element_by_class_name('block-wysiwyg').find_elements_by_tag_name("li")
for i in range(len(elementList)):
    driver.find_element_by_class_name('blcokwysiwyg').find_elements_by_tag_name("li").get_attribute("href")

But this code returned None.

Can anyone please help me with the above code?

Upvotes: 1

Views: 148

Answers (2)

SIM

Reputation: 22440

I suppose this will fetch the required content:

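import requests
from bs4 import BeautifulSoup

link = 'https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times'

# Fetch the page and parse the returned HTML
r = requests.get(link)
soup = BeautifulSoup(r.text, "html.parser")
# Each <li> inside the .block-wysiwyg container holds the text and its link
for item in soup.select(".block-wysiwyg li"):
    item_text = item.get_text(strip=True)
    anchor = item.select_one("a[href]")
    # Guard against an <li> that contains no link
    item_link = anchor.get("href") if anchor else None
    print(item_text, item_link)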
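
One caveat: BeautifulSoup returns the href exactly as it is written in the HTML, so the links may come back relative. I have not checked whether that is the case on this page. If it is, a minimal sketch for resolving them against the page URL could look like this (`absolute_link` and `BASE_URL` are hypothetical names, not part of the code above):

```python
from urllib.parse import urljoin

# Assumption: the page URL from the question is used as the base.
BASE_URL = "https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times"


def absolute_link(href, base=BASE_URL):
    """Resolve a possibly relative href against the page URL.

    Absolute hrefs pass through unchanged; a missing href gives None.
    """
    return urljoin(base, href) if href else None
```

You would then print absolute_link(item_link) instead of item_link inside the loop.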

Upvotes: 2

Jack Fleeting

Reputation: 24930

Try it this way:

coronas = driver.find_element_by_xpath("//div[@class='block-wysiwyg']/ul/li")
hr = coronas.find_element_by_xpath('./a')
print(coronas.text)
print(hr.get_attribute('href'))

Output:

The coronavirus is touching the lives of all Americans, but race, age, and income play a big role in the exact ways the virus — and the stalled economy — are affecting people. Here's what that means.
https://www.ipsos.com/en-us/america-under-coronavirus
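
The snippet above fetches only the first <li>. The code in the question has two further problems: find_elements_by_tag_name returns a list, and a list has no get_attribute method; and the href sits on the <a> inside each <li>, not on the <li> itself. To loop over every item, here is a minimal sketch, assuming a Selenium 4 driver that has already loaded the page. `collect_list_links` is a hypothetical helper name, and the locator strategy is passed as the plain string "css selector", which is the value of By.CSS_SELECTOR:

```python
def collect_list_links(driver, container_css=".block-wysiwyg"):
    """Return a list of (text, href) pairs, one per <li> in the container.

    Assumption: `driver` is a Selenium 4 WebDriver with the page loaded.
    """
    results = []
    # find_elements (plural) returns a list, so iterate over it
    for li in driver.find_elements("css selector", f"{container_css} li"):
        # The href belongs to the <a> inside the <li>, not to the <li>
        anchors = li.find_elements("css selector", "a[href]")
        href = anchors[0].get_attribute("href") if anchors else None
        results.append((li.text, href))
    return results
```

Calling collect_list_links(driver) after driver.get(...) should give one pair per list item. Note that the find_element_by_* helpers used above were removed in Selenium 4.3, so on a recent install the find_element(s) form is the one to use.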

Upvotes: 1
