Reputation: 20668
I want to query all text nodes in my DOM. However, I don't want the "markup line breaks", i.e. the whitespace-only text nodes that appear where there is a line break between HTML tags.
So I'm trying to use translate() to remove all whitespace characters and check whether any characters are left:
/html/body//text()[not(translate(., '	

', '') = '')]
This doesn't work, since it doesn't seem to be possible to compare against empty strings (which kind of makes sense, since it wouldn't be a text node then).
Is there any other approach to filter these nodes?
Upvotes: 2
Views: 1189
Reputation: 243449
Use:
/html/body//text()[normalize-space()]
This selects every text-node descendant of /html/body whose string value is non-empty after whitespace normalization. A predicate that evaluates to a string is true exactly when that string is non-empty, so whitespace-only text nodes are filtered out.
The above expression uses the standard XPath function normalize-space(), which takes a string (or the string-value of the context node, if called with no argument) and returns a new string in which all leading and trailing whitespace characters are removed and every intermediate run of adjacent whitespace characters is replaced by a single space.
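To make the behaviour of normalize-space() concrete, here is a minimal Python sketch that emulates it using only the standard library. It is not an XPath engine: the helper names normalize_space, iter_text_nodes and non_blank_text_nodes, as well as the sample markup, are made up for illustration, and the sketch assumes well-formed XHTML so that xml.etree.ElementTree can parse it.

```python
import re
import xml.etree.ElementTree as ET


def normalize_space(value):
    """Emulate XPath 1.0 normalize-space() for a plain string.

    XPath treats only space, tab, carriage return and line feed as whitespace.
    """
    # Collapse every whitespace run to one space, then trim both ends.
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")


def iter_text_nodes(element):
    """Yield the text nodes below `element` in document order.

    ElementTree has no text-node objects: text before the first child is in
    .text, and text following an element's end tag is in that element's .tail.
    """
    yield element.text
    for child in element:
        yield from iter_text_nodes(child)
        yield child.tail


def non_blank_text_nodes(element):
    """Keep only text nodes that are non-empty after normalization,
    roughly what //text()[normalize-space()] selects."""
    return [t for t in iter_text_nodes(element) if t and normalize_space(t)]


# Hypothetical sample document, used only for this illustration.
SAMPLE = """<html><body>
  <p>Hello <b>world</b></p>
  <div>
    <span>  spaced   text  </span>
  </div>
</body></html>"""

body = ET.fromstring(SAMPLE).find("body")
for text in non_blank_text_nodes(body):
    print(repr(text), "->", repr(normalize_space(text)))
```

The whitespace-only nodes between the tags (the "markup line breaks") are dropped, while the remaining text nodes are returned unchanged. The predicate only decides which nodes are selected; it does not modify their content.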
Upvotes: 3