Reputation: 4942
I have a piece of HTML like this:
<a href="/something">
Title
<span>Author</span>
</a>
I got a WebElement that matches this HTML. How can I extract only "Title" from it? Method .getText() returns "Title\nAuthor"...
Upvotes: 7
Views: 3803
Reputation: 544
Verify that an element is present for the XPath "//a[normalize-space(text())='Title']". It matches only if the text directly inside the 'a' tag is 'Title'.
Upvotes: 0
Reputation: 1464
You can use JavascriptExecutor to iterate over the child nodes, pick out the text node 'Title', and return its content, like below:
WebElement link = driver.findElement(By.xpath("//a[@href='/something']"));
JavascriptExecutor js = (JavascriptExecutor) driver;
// Return the trimmed content of the first direct text node of arguments[0].
String titleText = (String) js.executeScript(
    "for (var i = 0; i < arguments[0].childNodes.length; i++) {"
    + "  if (arguments[0].childNodes[i].nodeName == '#text') {"
    + "    return arguments[0].childNodes[i].textContent.trim(); } }", link);
The loop walks the child nodes of the link, which include the text node ('Title') and the SPAN ('Author'), and returns the content of the first text node only.
Note: Before this, I tried selecting the text node directly in the XPath, as below, but WebDriver throws an InvalidSelectorException because findElement requires the result to be an element, not a text node:
WebElement link = driver.findElement(By.xpath("//a[@href='/something']/text()"));
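The child-node walk can be checked without a browser. The following is a minimal sketch in Python, assuming the question's markup parses as well-formed XML; it uses the standard library's xml.dom.minidom instead of WebDriver, and the helper name own_text is made up for illustration. Unlike the snippet above, it concatenates every direct text node instead of returning the first one.

```python
from xml.dom import Node, minidom

# The markup from the question, written out with its line breaks.
HTML = '<a href="/something">\nTitle\n<span>Author</span>\n</a>'


def own_text(node):
    """Concatenate only the direct text-node children of node (hypothetical helper)."""
    parts = [
        child.data
        for child in node.childNodes
        if child.nodeType == Node.TEXT_NODE  # skips the <span> element child
    ]
    return "".join(parts).strip()


link = minidom.parseString(HTML).documentElement
print(own_text(link))
```

The span is an element node, so the node-type filter skips it and only the text around it is collected.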
Upvotes: 0
Reputation: 1
If using Python:
[x['textContent'].strip() for x in element.get_property('childNodes') if isinstance(x, dict)]
Where element is your element.
This will return ['Title', ''] (because there is whitespace after the span).
Upvotes: 0
Reputation: 14145
Here is a helper method written in Python.
def get_text_exclude_children(element):
    # Concatenate only the direct text nodes of the element, skipping child elements.
    return driver.execute_script(
        """
        var parent = arguments[0];
        var child = parent.firstChild;
        var textValue = "";
        while (child) {
            if (child.nodeType === Node.TEXT_NODE)
                textValue += child.textContent;
            child = child.nextSibling;
        }
        return textValue;""",
        element).strip()
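How to use it:
from selenium.webdriver.common.by import By
# find_element_by_xpath was removed in newer Selenium 4 releases; find_element(By.XPATH, ...) replaces it.
liElement = driver.find_element(By.XPATH, "//a[@href='your_href_goes_here']")
liOnlyText = get_text_exclude_children(liElement)
print(liOnlyText)
Use whatever locator strategy suits you to find the element; this method only needs the element whose own text you want (without its children's text).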
Upvotes: 0
Reputation: 9569
You can't do this with the WebDriver API alone; you have to do it in your own code. For example:
var textOfA = theAElement.getText();
var textOfSpan = theSpanElement.getText();
var text = textOfA.slice(0, textOfA.length - textOfSpan.length).trim();
Note that the trailing newline is actually part of the text of the <a> element, so if you don't want it, you need to strip it.
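The same suffix-stripping idea can be sketched in Python. This is only an illustration under assumptions: the function name strip_trailing_child_text is made up, and the two strings stand in for what the .text property of the <a> and <span> elements would return in the Selenium Python bindings. It also assumes the child's text sits at the very end of the parent's text, as in the question.

```python
def strip_trailing_child_text(parent_text, child_text):
    """Drop child_text from the end of parent_text, then trim whitespace.

    Hypothetical helper: it only handles a child whose text comes last.
    """
    if child_text and parent_text.endswith(child_text):
        parent_text = parent_text[: -len(child_text)]
    # The newline that separated the two texts is still there, so strip it.
    return parent_text.strip()


# Stand-ins for the rendered text of the <a> and the <span>.
print(strip_trailing_child_text("Title\nAuthor", "Author"))
```

Checking endswith first avoids cutting into the title when the child text is not actually at the end.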
Upvotes: 7