foxite

Reputation: 191

Search the HTML document's text for certain strings (and replace those)

I'm writing a Firefox extension. I want to go through all of the page's visible text (not JavaScript or image sources) and replace certain strings. I currently have this:

var text = document.documentElement.innerHTML;

var anyRemaining = true;
do {    
    var index = text.indexOf("search");
    if (index != -1) {
        // This does not just replace the string with something else, 
        // there's complicated processing going on here. I can't use 
        // string.replace().
    } else {
        anyRemaining = false;
    }
} while (anyRemaining);

This works, but it also goes through markup and non-text content such as JavaScript, and I only want it to process the visible text. How can I do this?

I'm currently thinking of detecting an opening angle bracket and skipping ahead to the next closing one, but there might be better ways to do this.

Upvotes: 1

Views: 1421

Answers (2)

Kyle

Reputation: 4014

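You can use XPath to get all the text nodes on the page (leaving out those inside script and style elements, which are not visible text) and then do your search/replace on those nodes:

function replace(search,replacement){
	var xpathResult = document.evaluate(
		"//text()[not(ancestor::script) and not(ancestor::style)]", 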
		document, 
		null, 
		XPathResult.ORDERED_NODE_ITERATOR_TYPE, 
		null
	);
	var results = [];
	var res; // declared here so the loop below does not create a global
	// We store the result in an array because if the DOM mutates
	// during iteration, the iteration becomes invalid.
	while ((res = xpathResult.iterateNext())) {
		results.push(res);
	}
	results.forEach(function(res){
		res.textContent = res.textContent.replace(search,replacement);
	})
}

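// Example: given this markup on the page
//   <div class="Hello">Hello world!</div>
// the call below rewrites the text to "Goodbye world!" and leaves the
// class attribute untouched.
replace(/Hello/g,'Goodbye');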

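If the per-match processing is too involved for string.replace(), the same text-node approach can be combined with an indexOf loop like the one in the question. The following is only a sketch under some assumptions: processMatches and processTextNodes are hypothetical names, the transform callback stands in for the "complicated processing", and a TreeWalker is used instead of XPath to collect the text nodes while skipping script, style and noscript elements.

```javascript
// Hypothetical helper: rebuilds `text`, handing each occurrence of `search`
// to `transform(match, index)` and resuming the scan after the match, so the
// loop always terminates.
function processMatches(text, search, transform) {
    if (search === "") return text; // an empty needle would match everywhere
    var out = "";
    var pos = 0;
    var index;
    while ((index = text.indexOf(search, pos)) !== -1) {
        out += text.slice(pos, index) + transform(search, index);
        pos = index + search.length;
    }
    return out + text.slice(pos);
}

// Hypothetical helper: applies processMatches to every text node under
// `root`, ignoring text whose parent is a script, style or noscript element.
// Must run in a page context where `document` and `NodeFilter` exist.
function processTextNodes(root, search, transform) {
    var skip = { SCRIPT: true, STYLE: true, NOSCRIPT: true };
    var walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT, {
        acceptNode: function (node) {
            var parent = node.parentNode;
            return parent && skip[parent.nodeName.toUpperCase()]
                ? NodeFilter.FILTER_REJECT
                : NodeFilter.FILTER_ACCEPT;
        }
    });
    // Collect the nodes first, then modify them.
    var nodes = [];
    while (walker.nextNode()) nodes.push(walker.currentNode);
    nodes.forEach(function (node) {
        var updated = processMatches(node.nodeValue, search, transform);
        if (updated !== node.nodeValue) node.nodeValue = updated;
    });
}
```

In a content script this could be called as processTextNodes(document.body, "search", function (match) { return match.toUpperCase(); }); where the callback is just a placeholder for the real processing.
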
Upvotes: 2

Yash Dayal

Reputation: 1174

You can either use a regex to strip the HTML tags or, perhaps more easily, use a JavaScript function that returns the text without the HTML. See this question for more details: How can get the text of a div tag using only javascript (no jQuery)
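
As a rough illustration of the second option, the sketch below reads an element's text through the standard innerText and textContent properties and then locates the search string in it. The helper names getPlainText and findAll are made up for this example and are not from the linked question.

```javascript
// Hypothetical helper: returns the text of `element` without any markup.
// innerText is rendering-aware (it leaves out script/style contents and
// hidden elements); textContent is used as a fallback and returns the
// text of every descendant, including script source.
function getPlainText(element) {
    return typeof element.innerText === "string"
        ? element.innerText
        : element.textContent;
}

// Hypothetical helper: returns the start index of every non-overlapping
// occurrence of `search` in `text`.
function findAll(text, search) {
    var indices = [];
    if (search === "") return indices; // nothing sensible to look for
    var index = text.indexOf(search);
    while (index !== -1) {
        indices.push(index);
        index = text.indexOf(search, index + search.length);
    }
    return indices;
}
```

On a page this might be used as findAll(getPlainText(document.body), "search"). Note that this only finds the strings: assigning a modified string back to innerText or textContent replaces all of the element's children with a single text node, so it is not suitable for replacing text in place.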

Upvotes: 0
