Reputation: 4891
I am having trouble getting the content of an HTML comment (<!-- stuff -->) located in the head tag of an HTML page, using Python 2.7 and Selenium.
<head>
<!-- I would like to get this sentence -->
[...]
</head>
I got the XPath of that comment using FirePath/Firebug (so I am assuming it is correct): html/head/comment()[1]
Then:
given_driver.find_element_by_xpath('html/head/comment()[1]')
raises an InvalidSelectorException with this message:
Message: The given selector html/head/comment()[1] is either invalid or does not result in a WebElement. The following error occurred:
InvalidSelectorError: The result of the xpath expression "html/head/comment()[1]" is: [object Comment]. It should be an element.
Selecting the head element itself works:
head_element = given_driver.find_element_by_xpath('html/head')
and head_element.get_attribute('innerHTML') then gives me the whole HTML content of the head tag, like: u'<!-- I would like to get this sentence -->\n [...]
But I would like to get just the content of the comment inside the head tag. I am wondering whether this is simply not possible with Selenium, which would seem strange to me. How could I get it?
Upvotes: 2
Views: 2451
Reputation: 42518
The Selenium API doesn't support comment nodes: its locators can only return elements. However, you can easily get the comment with this piece of JavaScript, run through execute_script:
def get_element_comment(element):
    # element.parent is the WebDriver instance that found this element.
    # '\\n' keeps the backslash so JavaScript sees '\n' and not a raw line break.
    return element.parent.execute_script("""
        // nodeType 8 is a comment node
        return Array.prototype.slice.call(arguments[0].childNodes)
            .filter(function(e) { return e.nodeType === 8 })
            .map(function(e) { return e.nodeValue.trim() })
            .join('\\n');
        """, element)

head = driver.find_element_by_css_selector("head")
comment = get_element_comment(head)
print(comment)
Upvotes: 3
Reputation: 1895
You have to get the page source and find (parse) the required comment from there. Something like this:
driver.Navigate().GoToUrl("your url");
var src = driver.PageSource;
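Then parse src to extract the comment. (The snippet above uses the C# bindings; in Python the equivalents are driver.get("your url") and driver.page_source.)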
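The parsing step could look roughly like the sketch below. It assumes Python 3's standard-library html.parser (the question targets Python 2.7, where the module is named HTMLParser), and the names HeadCommentParser and head_comments are hypothetical, chosen only for illustration. The page_source argument stands for the string returned by driver.page_source.

```python
from html.parser import HTMLParser


class HeadCommentParser(HTMLParser):
    """Collect the text of every comment found between <head> and </head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.comments = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->"
        if self.in_head:
            self.comments.append(data.strip())


def head_comments(page_source):
    """Return the stripped comment texts inside the head of an HTML string."""
    parser = HeadCommentParser()
    parser.feed(page_source)
    parser.close()
    return parser.comments
```

With a live session this would be called as head_comments(driver.page_source), taking the first item of the returned list to get the first comment. Unlike the JavaScript approach, it needs a second parse of the whole page, but it does not depend on script execution in the browser.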
Upvotes: 0