LF-DevJourney

Reputation: 28554

BeautifulSoup: parse a single Selenium element

I use soup = BeautifulSoup(driver.page_source) to parse the whole page from Selenium with BeautifulSoup.

But how can I parse just one Selenium element with BeautifulSoup?

The code below raises

TypeError: object of type 'FirefoxWebElement' has no len()

element = driver.find_element_by_id(id_name)
soup = BeautifulSoup(element)

Upvotes: 1

Views: 343

Answers (1)

Ahmed I. Elsayed

Reputation: 2130

I don't know if Selenium does this out of the box, but I found this workaround:

element_html = f"<{element.tag_name}>{element.get_attribute('innerHTML')}</{element.tag_name}>"

You may want to replace innerHTML with innerText if you only want the text. For example, given

<li>Hi <span> man </span> </li>

innerHTML returns everything inside the <li>, including the <span> markup, whereas innerText returns only the text. Try it and see.

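Now create your soup object:

soup = BeautifulSoup(element_html, "lxml")  # name the parser explicitly; lxml must be installed
print(soup.WHATEVER)  # WHATEVER is a placeholder for the tag or attribute you need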

Using the above technique, you can write a method parseElement(webElement) and call it whenever you want to parse an element.
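
A minimal sketch of such a helper, assuming the element exposes tag_name and get_attribute("innerHTML") as a Selenium WebElement does. The names element_to_html and parse_element and the default parser "html.parser" are illustrative choices, not part of Selenium or BeautifulSoup:

```python
def element_to_html(element):
    """Rebuild an HTML fragment from a Selenium WebElement.

    Only tag_name and the innerHTML attribute are used, so the element's
    own attributes (id, class, ...) are not carried over.
    """
    tag = element.tag_name
    inner = element.get_attribute("innerHTML") or ""
    return f"<{tag}>{inner}</{tag}>"


def parse_element(element, parser="html.parser"):
    """Return a BeautifulSoup object for a single WebElement.

    The parser default is an assumption; pass "lxml" if it is installed.
    """
    # Imported here so element_to_html stays usable without bs4 installed.
    from bs4 import BeautifulSoup

    return BeautifulSoup(element_to_html(element), parser)
```

With the question's code this would be called as soup = parse_element(driver.find_element_by_id(id_name), "lxml"). If the element's own attributes matter, element.get_attribute("outerHTML") is probably a simpler source string, since it already includes the opening tag with its attributes.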

By the way, I only use the lxml parser, and when I forgot to specify it, the script didn't work.

Upvotes: 2
