Reputation: 65
I'm writing a Greasemonkey script to selectively hide elements containing nasty stuff (a personal web sanitizer, if you will).
Here's what I've got so far:
//custom contains function which is case-insensitive
$.extend($.expr[":"], {
    "containsNC": function(elem, i, match, array) {
        return (elem.textContent || elem.innerText || "").toLowerCase().indexOf((match[3] || "").toLowerCase()) >= 0;
    }
});
//build array of words to filter
var nope = "long list of horrible words".toLowerCase().split(' ');
//start with an empty jQuery object
var nopeEles = $();
//add elements to filter to it
for (var i = 0; i < nope.length; i++) {
    nopeEles = nopeEles.add( $("a:containsNC('" + nope[i] + "')") );
    nopeEles = nopeEles.add( $("p:containsNC('" + nope[i] + "')") );
}
//hide all applicable elements
nopeEles.css("background-color", "white");
nopeEles.css("color", "white");
It works decently, but it does partial word matching, which makes short words not work. I want to filter elements containing words like "die" and "gun", without filtering those with words like "candied" or "gung-ho".
To be clear, I'm after whole-word, not exact-text. I want "gun" in the list to match not just "gun" but also "he fired a gun" and "a gun was fired". And not "gunney sergeant".
Every other answer I've seen on this topic recommends jQuery's filter(). I think I don't understand it well enough. I tried using this line in the loop, but nothing:
nopeEles = nopeEles.add( $("a").filter(function() { return $(this).text() === nope[i]; }) );
The other angle I thought to look at was fiddling with containsNC so it looks for the word, but with whitespace or end-of-string on either side. I don't really get how containsNC works, though.
Any pointers would be hugely appreciated!
Upvotes: 1
Views: 1408
Reputation: 93473
That containsNC is just a subpar version of this p:containsCI() jQuery extension.
("NC" == "no case" ≈≈ "CI" == "Case insensitive".)
Use the linked jQuery extension instead and then you can use regex to match whole words like:
nopeEles = nopeEles.add( $("a:containsCI('\\b" + nope[i] + "\\b')") );
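To see what the \b anchors buy you, here is a minimal sketch in plain JavaScript (no jQuery), run against the example phrases from the question. The helper name matchesWholeWord is made up for illustration, and it assumes the term is a plain word with no regex metacharacters in it.

```javascript
// Hypothetical helper for illustration; not part of jQuery or the linked extension.
// Assumes `term` contains no regex metacharacters.
function matchesWholeWord(text, term) {
    // "\\b" in a string literal becomes \b (word boundary) in the regex.
    // A bare "\b" in a string literal would be a backspace character instead.
    var re = new RegExp("\\b" + term + "\\b", "i");
    return re.test(text);
}

var samples = [
    "gun", "he fired a gun", "a Gun was fired",
    "gunney sergeant", "gung-ho", "candied"
];
var hits = samples.filter(function (s) {
    return matchesWholeWord(s, "gun");
});
console.log(hits);
```

Only the first three samples should end up in hits. One caveat: \b treats hyphens and apostrophes as word boundaries, so a term like "ho" would still match "gung-ho".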
However, the question's code is rather inefficient, and you'll find that it slows the page: it scans the whole page 2N times (where N is the number of terms), multiplied by J substring scans (where J is the number of <a> and <p> nodes).
A more performant way is to scan each node only once by merging the regex. See this demo:
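//-- Register a case-insensitive, regex-based :containsCI() pseudo-selector.
jQuery.expr[':'].containsCI = function (a, i, m) {
    var sText = (a.textContent || a.innerText || "");
    var zRegExp = new RegExp (m[3], 'i'); // m[3] is the selector argument, used as a regex.
    return zRegExp.test (sText);
};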
//-- Build array of terms to filter:
var badTerms = ['die', 'guns?', 'agitators?'];
//-- Build ONE regex string for speed and efficiency:
var cnsrRegEx = `\\b(${badTerms.join ("|")})\\b`; // \b is word-break regex.
var nopeEles = $("a, p").filter (":containsCI('" + cnsrRegEx + "')");
//-- Hide all applicable elements:
nopeEles.css ( {
    "background-color": "white",
    "color": "white"
} );
a, p {border: 1px solid lightgray; padding: 0.3ex 1ex;}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.0/jquery.min.js"></script>
<p>All good</p>
<p>All bad agitators</p>
<div>Some bad: <a>die</a> <a>gun</a> <a>candied</a> <a>gung-ho</a> <a>guns</a>
<a>he fired a gun</a> <a>gunney sergeant</a>
</div>
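Stripped of jQuery, the single-pass idea in the demo above looks roughly like this. It's only a sketch: buildWordRegex and escapeRegExp are illustrative names, not library functions, and it assumes every term is a literal word (so you would list "gun" and "guns" separately rather than writing guns?).

```javascript
// Illustrative helpers; the names are made up for this sketch.

// Escape regex metacharacters so a term is matched literally.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Combine all terms into ONE case-insensitive, whole-word regex.
function buildWordRegex(terms) {
    var alternatives = terms.map(escapeRegExp).join("|");
    return new RegExp("\\b(?:" + alternatives + ")\\b", "i");
}

var badTerms = ["die", "gun", "guns", "agitators"];
var censorRegex = buildWordRegex(badTerms);

// Each text is tested once, no matter how many terms are in the list.
var texts = [
    "All good", "All bad agitators", "die", "candied",
    "gung-ho", "guns", "gunney sergeant"
];
var flagged = texts.filter(function (t) {
    return censorRegex.test(t);
});
console.log(flagged);
```

In the userscript itself, the same regex could be applied with jQuery's function form of filter(), e.g. $("a, p").filter(function () { return censorRegex.test($(this).text()); }), which avoids passing the regex through a selector string at all.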
Note:
guns? allows matching of both "gun" and "guns".
\ characters must be escaped. That is, use "\\b" to get \b in regex.
Upvotes: 2