Reputation: 143
I would like to find the fastest way (other than looping over every element of the array) to check whether $string = 'hello my name is john'
contains any of the words in $array = ['none','tomatos','john']
and does not contain any of the blacklisted words in $black_array = ['find','other'].
In this example the result should be true.
Currently I loop over every element of the array and use $string.search($array[i]).
Upvotes: 0
Views: 614
Reputation: 1264
The other responses are adequate for relatively small arrays. But if you want to optimize for the case where, say, you have millions of candidate substrings in your array (which is generally the impetus for optimization), I would suggest constructing a "string tree" (commonly called a trie or prefix tree).
In this tree, the root branches to each candidate first byte. So, for example, if the array contained only {"none", "tomatoes", "john"}, there would be three child nodes from the root: 'n', 't', and 'j'. (Multiple candidate strings starting with the same first byte would descend through the same tree node.) Then those nodes in turn branch to candidate second bytes, and so on.
This approach allows you to make a small number of comparisons for each byte in your string: never more than 256 at a node (one per possible byte value), and generally far fewer. In comparison, if you have a million candidate strings, the other proposed approaches (including the accepted one) will, in the worst case (which again is the case we generally optimize against), have to make millions of comparisons for each string byte.
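The tree described above could be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: the names buildTrie, containsAny and check are hypothetical, nodes branch on UTF-16 code units rather than raw bytes, and the scan restarts from every position in the string (an Aho-Corasick automaton would avoid those restarts). Note that it matches substrings, so "mother" would count as containing "other".

```javascript
// Build a trie from the candidate words.
// Each node maps one character to a child node; `end` marks a complete word.
function buildTrie(words) {
  const root = { children: new Map(), end: false };
  for (const word of words) {
    if (word.length === 0) continue; // ignore empty candidates
    let node = root;
    for (let i = 0; i < word.length; i++) {
      const ch = word[i];
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), end: false });
      }
      node = node.children.get(ch);
    }
    node.end = true;
  }
  return root;
}

// Return true if any word stored in the trie occurs as a substring of `text`.
function containsAny(root, text) {
  for (let start = 0; start < text.length; start++) {
    let node = root;
    for (let i = start; i < text.length; i++) {
      node = node.children.get(text[i]);
      if (node === undefined) break; // no candidate continues with this character
      if (node.end) return true;     // reached the end of a candidate word
    }
  }
  return false;
}

// Build each trie once, then reuse it for every string that is checked.
const allowedTrie = buildTrie(["none", "tomatos", "john"]);
const bannedTrie = buildTrie(["find", "other"]);
const check = (str) => containsAny(allowedTrie, str) && !containsAny(bannedTrie, str);

console.log(check("hello my name is john"));       // true
console.log(check("hello my other name is john")); // false
```

The cost of each step depends on the length of the string and of the candidate words, not on how many candidates the array holds, which is the point of the approach.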
Upvotes: 1
Reputation:
You could use Array.join() and RegExp(). For example:
let $string ='hello my name is john';
let $black_string ='hello my name is other';
let $black_string2 ='hello my name is mother';
let $array = ['none','tomatos','john'];
let $black_array = ['find','other'];
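// Build one alternation per list, with each word wrapped in \b word boundaries
let re = new RegExp("\\b"+$array.join("\\b|\\b")+"\\b");
let re_blck = new RegExp("\\b"+$black_array.join("\\b|\\b")+"\\b");
// Contains "john" and no blacklisted word
let $hasArray = re.test($string);
let $hasBlackArray = re_blck.test($string);
console.log($hasArray,$hasBlackArray); // true false
// Contains the blacklisted word "other" and no allowed word
$hasArray = re.test($black_string);
$hasBlackArray = re_blck.test($black_string);
console.log($hasArray,$hasBlackArray); // false true
// "mother" is not matched as "other" because of the word boundaries
$hasArray = re.test($black_string2);
$hasBlackArray = re_blck.test($black_string2);
console.log($hasArray,$hasBlackArray); // false false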
Upvotes: 0
Reputation: 14891
You could do this by combining multiple condition checks:
const array = ["none", "tomatos", "john"]
const black_array = ["find", "other"]
const check = (str) =>
  [
    array.some((word) => str.includes(word)),
    black_array.every((word) => !str.includes(word)),
  ].every((criteria) => criteria === true)
console.log(check("hello my name is john"))
console.log(check("hello my name is other"))
console.log(check("hello my name is peter"))
Upvotes: 0
Reputation: 164897
Sounds like you could use a combination of Array.prototype.some() for the allowed words and Array.prototype.every() with a negated test for the banned ones.
const wordToRegex = word => new RegExp(`\\b${word}\\b`, "i")
const check = (str, allowed = [], banned = []) =>
  allowed.some(word => wordToRegex(word).test(str))
  && banned.every(word => !wordToRegex(word).test(str))
const allowed = ['none','tomatos','john']
const banned = ['find','other']
console.info(check('hello my name is john', allowed, banned))
console.info(check('hello my other name is john', allowed, banned))
Not sure I'd call this fast though. You'd be better off with an actual indexing search engine.
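Short of a full search engine, one cheaper step in that direction might be to split the string into words once and look each word up in a Set, so the cost no longer grows with the size of the word lists. The sketch below is only an illustration under assumptions: checkTokens, allowedSet and bannedSet are hypothetical names, and words are taken to be runs of ASCII letters, digits and underscores, compared case-insensitively.

```javascript
// Assumes the word lists are stored in lowercase.
const allowedSet = new Set(["none", "tomatos", "john"]);
const bannedSet = new Set(["find", "other"]);

function checkTokens(str) {
  // Split on runs of non-word characters and drop empty pieces
  const tokens = str.toLowerCase().split(/\W+/).filter(Boolean);
  let sawAllowed = false;
  for (const token of tokens) {
    if (bannedSet.has(token)) return false; // any banned word fails the check
    if (allowedSet.has(token)) sawAllowed = true;
  }
  return sawAllowed;
}

console.info(checkTokens("hello my name is john"));       // true
console.info(checkTokens("hello my other name is john")); // false
```

Each Set lookup takes roughly constant time, so the work per string is proportional to the number of words in it.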
Upvotes: 1
Reputation: 147206
One way to implement this would be to create a regex from each array and test for a match in the whitelist and no match in the blacklist:
const $string = 'hello my name is john';
const $array = ['none', 'tomatos', 'john'];
const $black_array = ['find', 'other']
const white = new RegExp('\\b(' + $array.join('|') + ')\\b');
const black = new RegExp('\\b(' + $black_array.join('|') + ')\\b');
const match = white.test($string) && !black.test($string);
console.log(match);
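
One caveat when joining raw words into a pattern: a word that contains regex metacharacters (such as "." or "(") changes the meaning of the pattern. A possible guard, sketched here with the hypothetical helpers escapeRegex and buildWordRegex, is to escape each word before joining. It assumes the arrays are non-empty, since an empty alternation would match at any word boundary.

```javascript
// Escape characters that have a special meaning inside a regular expression
const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Build a single whole-word alternation from a list of literal words
const buildWordRegex = (words) =>
  new RegExp("\\b(?:" + words.map(escapeRegex).join("|") + ")\\b");

const whiteRe = buildWordRegex(["none", "tomatos", "john"]);
const blackRe = buildWordRegex(["find", "other"]);
const str = "hello my name is john";
console.log(whiteRe.test(str) && !blackRe.test(str)); // true

// Without escaping, "a.b" would also match "axb"; with escaping it does not
console.log(buildWordRegex(["a.b"]).test("axb")); // false
```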
Upvotes: 2