Reputation: 711
I see that indexOf is faster when you're scanning against one word according to the solution here JavaScript: indexOf vs. Match when Searching Strings?
However, what if you have a list of say 5 keywords and you want to count the occurrence of each (assuming each word appears only once in the string of large text).
Would the below be faster?
var list1 = ['word1', 'word2','word3','word4','word5'];
for (var i = 0; i < list1.length; i++){
if (exampleLargeText.indexOf(list1[i]) > -1){
keywordCounter++;
}
}
vs....
var keywordRegex = 'word1|word2|word3|word4|word5'];
var keywordCounter = exampleLargeText.toLowerCase().match(new RegExp(SUBMIT_ELEMENT_REGEX , "ig")) || []).length
Is indexOf() still faster despite the fact that you're scanning exampleLargeText 5 times here?
Upvotes: 1
Views: 744
Reputation: 20486
A regular expression like /aaa|bbb|ccc/
will never be more efficient than a simpler (yet still similar, 3 characters) expression like /abc/
. This is because regex engines match left to right. The easiest matches would be 'aaa' for the first and 'abc' for the second...each of these take 3 steps. Now, imagine, you try to match 'aabbccx' to both expressions. The first expression would take a total of 33 steps and the second would take 5 steps, this is because each alternation (denoted by |
) forces the regex engine to start over. Play around with that on a tool like Regex101.
However, if you were able to make your regular expression more optimized than just checking each word separately, there is a chance it could beat .indexOf()
. For instance, if your expression is truly /word1|word2|word3|word4|word5/
, it could be rewritten as /word[1-5]/
. This is way more efficient than looking for each word separately because now the expression is defined in a simple pattern. Who knows, though, .indexOf()
still may be faster depending on the overhead.
That's when benchmarking comes into play -- use jsPerf!
Upvotes: 2