Reputation: 9144

How to select all child text but exclude one tag with Selenium's XPath selector?

I have this HTML:

<div id="content">
    <h1>Title 1</h1><br><br>

    <h2>Sub-Title 1</h2>
    <br><br>
    Description 1.<br><br>Description 2.
    <br><br>

    <h2>Sub-Title 2</h2>
    <br><br>
    Description 1<br>Description 2<br>
    <br><br>

    <div class="infobox">
        <font style="color:#000000"><b>Information Title</b></font>
        <br><br>Long Information Text
    </div>
</div>

I want to get all the text inside <div id="content"> using Selenium's find_element_by_xpath function, excluding the content of <div class="infobox">. The expected result looks like this:

Title 1


Sub-Title 1


Description 1.

Description 2.


Sub-Title 2


Description 1
Description 2

I can get it with this expression in an online XPath tester:

//div[@id="content"]/descendant::text()[not(ancestor::div/@class="infobox")]

But if I pass the same expression to Selenium's find_element_by_xpath, I get selenium.common.exceptions.InvalidSelectorException:

result = driver.find_element_by_xpath('//div[@id="content"]/descendant::text()[not(ancestor::div/@class="infobox")]')

Upvotes: 2

Views: 2725

Answers (1)

alecxe

Reputation: 474171

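The XPath used inside find_element_by_xpath() has to point to an element, not a text node or an attribute. Your expression ends in text(), so it selects text nodes, which is why Selenium raises InvalidSelectorException.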
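Since the XPath cannot return text nodes, the filtering can instead be done outside the XPath call. The following is only a sketch of one such alternative, using Python's standard html.parser on an HTML string. It assumes you feed it the markup of the content div (for example from parent.get_attribute('outerHTML')). The names TextExcludingClass and text_without_class are made up for illustration, and the result is the raw text of the markup, not the rendered text that Selenium's .text returns.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TextExcludingClass(HTMLParser):
    """Hypothetical helper: collect text, skipping subtrees with a given class."""

    def __init__(self, excluded_class):
        super().__init__()
        self.excluded_class = excluded_class
        self.skip_depth = 0  # > 0 while inside an excluded subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        # Enter an excluded subtree, or go one level deeper inside one.
        if self.skip_depth or self.excluded_class in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that lies outside the excluded subtree.
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def text_without_class(html, excluded_class="infobox"):
    """Return the non-blank text chunks of html outside excluded_class subtrees."""
    parser = TextExcludingClass(excluded_class)
    parser.feed(html)
    parser.close()
    return parser.chunks
```

The function returns a list of stripped text chunks in document order, which you can join with newlines if you need a single string.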

The easiest approach here would be to find the parent tag, find the child tag whose text you want to exclude, and remove the child's text from the parent's text:

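# Locate the container and the child whose text should be excluded
parent = driver.find_element_by_id('content')
child = parent.find_element_by_class_name('infobox')
# Remove the child's rendered text from the parent's rendered text
print(parent.text.replace(child.text, ''))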
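One caveat with the replace call above: str.replace removes every occurrence of the child's text, so it would also strip matching text elsewhere in the parent. The following is a minimal sketch of a stricter variant. It assumes, as in the HTML from the question, that the excluded element is the last thing in the parent, and strip_child_text is a made-up helper name.

```python
def strip_child_text(parent_text: str, child_text: str) -> str:
    """Return parent_text with only the last occurrence of child_text removed."""
    if not child_text:
        # Nothing to remove; just drop trailing whitespace.
        return parent_text.rstrip()
    # rpartition splits at the LAST occurrence; if child_text is absent,
    # head and sep are empty and tail holds the whole parent_text.
    head, sep, tail = parent_text.rpartition(child_text)
    return (head + tail).rstrip()
```

With the elements from the snippet above, it would be called as strip_child_text(parent.text, child.text).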

Upvotes: 5
