Reputation: 5922
I'm running Selenium with Firefox in Python, and I'm trying to match elements on a page against keywords in a list.
For the element lookup to be successful, I need to get rid of some special characters like ® and ™ on the web page. I am unfortunately not able to predict when such characters are employed, and I therefore can't add them on the "keyword end" of the problem.
I don't think that Selenium or Firefox itself can remove unwanted characters from a webpage, but my thought was to have Selenium execute some JavaScript on the page to remove those characters. Is that possible?
Something like this (presumably non-working) pseudo-code:
driver.execute_script("document.body.innerHTML.replace(/®/g, '');")
The replacement should happen before the driver tries to "read" the page and find_element.
FYI, the characters I want to get rid of are in <a> text() nodes in <td> cells across the document body.
Upvotes: 1
Views: 1342
Reputation: 329
ASCII covers code points 0 to 127, so you can strip every non-ASCII character this way:
document.body.innerHTML = document.body.innerHTML.replace(/[^\x00-\x7F]/g, '');
If you want to remove only ® you can do it this way:
document.body.innerHTML = document.body.innerHTML.replace(/®/g, '');
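To answer the "is that possible?" part: yes, execute_script can run this kind of replacement from Python. Below is a minimal sketch of one way to do it, not the only one. It walks the text nodes instead of rewriting innerHTML, since assigning innerHTML makes the browser re-parse the whole body, which can drop event listeners and invalidate element references you already hold. The names STRIP_CHARS_JS and strip_characters are made up for this example, and the default character list is just the two symbols from the question.

```python
# JavaScript that removes the given characters from every text node under <body>.
# Selenium exposes the extra execute_script arguments to the script as `arguments`.
STRIP_CHARS_JS = """
const chars = arguments[0];
const walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
let changed = 0;
while (walker.nextNode()) {
    const node = walker.currentNode;
    let text = node.nodeValue;
    for (const ch of chars) {
        text = text.split(ch).join('');  // plain replace-all, no regex escaping needed
    }
    if (text !== node.nodeValue) {
        node.nodeValue = text;
        changed += 1;
    }
}
return changed;
"""


def strip_characters(driver, chars=("®", "™")):
    """Remove `chars` from all text nodes on the current page.

    `driver` is assumed to be a Selenium WebDriver (anything with an
    execute_script(script, *args) method works). Returns the number of
    text nodes that were modified.
    """
    return driver.execute_script(STRIP_CHARS_JS, list(chars))
```

You would call strip_characters(driver) after driver.get(...) and before any find_element lookup. If the page loads content dynamically, the call presumably has to be repeated once the new content is in place.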
Upvotes: 2