0xbe5077ed

Reputation: 4765

How to select all *renderable* text elements in the browser

How can I select all visible renderable HTML text nodes in a browser document?

In other words, how can I get a list of DOM nodes I can traverse via scripting in order to obtain the text that is actually visible to the user in the browser, in document order?

I would like to rely on the browser to tell me the nodes that constitute currently visible renderable text. I'm not sure where to start. Help?

Upvotes: 1

Views: 123

Answers (2)

Rick Hitchcock

Reputation: 35670

This is tricky, but here's what I've come up with:

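function traverse(o) {
  var a = [];
  [].forEach.call(o.childNodes, function(val) {
    if (val.nodeType === 3) {
      // Text node: keep it unless it contains only whitespace.
      if (val.nodeValue.trim() !== '') a.push(val);
    }
    else if (val.nodeType === 1) {
      // Element node: descend only if its computed style leaves it visible.
      // Other node types (e.g. comments) are skipped, because
      // getComputedStyle() only accepts elements.
      var style = getComputedStyle(val);
      if (val.tagName !== 'NOSCRIPT' &&
          style.getPropertyValue('display') !== 'none' &&
          style.getPropertyValue('visibility') !== 'hidden' &&
          style.getPropertyValue('opacity') !== '0' &&
          // Heuristic: text painted in its own background color is treated as hidden.
          style.getPropertyValue('color') !== style.getPropertyValue('background-color')
         ) {
        a = a.concat(traverse(val));
      }
    }
  });
  return a;
} //traverse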

var textNodes= traverse(document.body);


This does not check if text nodes are hidden behind other elements or if they are absolutely positioned offscreen.
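The off-screen case could be approximated by measuring each text node with a Range and testing its rectangle against the viewport. The following is only a rough sketch: the helper names rectIntersectsViewport and isTextNodeOnScreen are made up for illustration, and it assumes "visible" means "overlaps the current viewport", so text scrolled out of view is also excluded. It still does not detect text covered by other elements.

```javascript
// Pure helper (hypothetical name): true when a rectangle overlaps the viewport.
// `rect` has the shape returned by getBoundingClientRect().
function rectIntersectsViewport(rect, viewportWidth, viewportHeight) {
  return rect.width > 0 && rect.height > 0 &&
         rect.bottom > 0 && rect.right > 0 &&
         rect.top < viewportHeight && rect.left < viewportWidth;
}

// Browser-only helper (hypothetical name): measures a text node via a Range.
function isTextNodeOnScreen(textNode) {
  var range = document.createRange();
  range.selectNodeContents(textNode);
  var rect = range.getBoundingClientRect();
  return rectIntersectsViewport(
    rect,
    document.documentElement.clientWidth,
    document.documentElement.clientHeight
  );
}
```

In a browser, the result of traverse could then be narrowed with traverse(document.body).filter(isTextNodeOnScreen).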

Upvotes: 2

Dave

Reputation: 10924

You should be able to do this in one line of JavaScript. Note that innerText returns the rendered text as a single string rather than a list of nodes:

document.querySelector("body").innerText
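Since innerText gives back one string, traversing the visible text piece by piece needs an extra step. One possible approach, sketched here with a hypothetical helper named renderedLines, is to split that string on line breaks and drop the blank entries. This yields lines of text in document order, not DOM nodes.

```javascript
// Hypothetical helper: split rendered text into non-empty, trimmed lines.
function renderedLines(text) {
  return text
    .split(/\r?\n/)
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line !== ''; });
}

// Browser-only usage; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  var lines = renderedLines(document.body.innerText);
  console.log(lines.length + ' visible lines of text');
}
```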

Upvotes: 1
