Maik Hasler

Reputation: 1400

How to receive the inner HTML of a child node in Selenium (Python)?

I'm trying to iterate through multiple nodes and retrieve various child nodes from the parent nodes. Assume I have something like the following structure:

<div class="wrapper">
    <div class="item">
        <div class="item-footer">
            <div class="item-type">Some data in here</div>
        </div>
    </div>
    <!-- More items listed here -->
</div>

I'm able to retrieve all child nodes of the wrapper container using the following:

from selenium.webdriver.common.by import By

wrapper = driver.find_element(By.XPATH, '/html/body/div')
items = wrapper.find_elements(By.XPATH, './/*')

Anyway, I couldn't figure out how to get the inner HTML of the container that holds the item type information. I tried this, but it didn't work:

for item in items:
    item_type = item.find_element(By.XPATH, './/div/div').get_attribute('innerHTML')
    print(item_type)

This results in the following error:

NoSuchElementException: Message: Unable to locate element:

Does anybody know how I can do that?

Upvotes: 3

Views: 875

Answers (3)

KunduK

Reputation: 33384

You just need a relative XPath that identifies each element, and then iterate over the results.

items = driver.find_elements(By.XPATH, "//div[@class='wrapper']//div[@class='item']//div[@class='item-type']")
for item in items:
    print(item.text)
    print(item.get_attribute('innerHTML'))

Or use a CSS selector:

items = driver.find_elements(By.CSS_SELECTOR, ".wrapper > .item .item-type")
for item in items:
    print(item.text)
    print(item.get_attribute('innerHTML'))
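
If you want to keep the per-item iteration from the question, a minimal sketch (assuming the same markup and that `driver` is an existing WebDriver instance) could scope the child lookup to each item:

from selenium.webdriver.common.by import By

# Find each item first, then look up its item-type child relative to that item
items = driver.find_elements(By.CSS_SELECTOR, ".wrapper > .item")
for item in items:
    item_type = item.find_element(By.CSS_SELECTOR, ".item-footer .item-type")
    print(item_type.get_attribute('innerHTML'))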

Upvotes: 1

JayPeerachai

Reputation: 3842

You can use BeautifulSoup on the page source you get from Selenium to easily scrape the HTML data.

from bs4 import BeautifulSoup

# selenium code part
# ....
# ....
# driver.page_source is the HTML result from selenium

html_doc = BeautifulSoup(driver.page_source, 'html.parser')
items = html_doc.find_all('div', attrs={'class':'item'})
for item in items:
    text = item.find('div', attrs={'class':'item-type'}).text
    print(text)

Output:

Some data in here
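
Since the question asks for the inner HTML specifically, a minimal follow-up sketch (reusing the same parsed `html_doc` from above) could use BeautifulSoup's CSS selector API together with `decode_contents()`, which returns a tag's inner HTML:

# Select the item-type divs directly and print their inner HTML and text
for node in html_doc.select('div.item div.item-type'):
    print(node.decode_contents())  # inner HTML of the node
    print(node.get_text())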

Upvotes: 1

Prophet

Reputation: 33381

If all the elements whose content you want to get are divs with the class attribute value item-type located inside divs with the class attribute value item-footer, you can simply do the following:

elements = driver.find_elements(By.XPATH, '//div[@class="item-footer"]//div[@class="item-type"]')
for element in elements:
    data = element.get_attribute('innerHTML')
    print(data)

Upvotes: 1
