Javascript: Regex vs IndexOf when scanning for a list of keywords

Question

I see that indexOf is faster when you're scanning against one word according to the solution here JavaScript: indexOf vs. Match when Searching Strings?

However, what if you have a list of say 5 keywords and you want to count the occurrence of each (assuming each word appears only once in the string of large text).

Would the below be faster?

var list1 = ['word1', 'word2','word3','word4','word5'];
for (var i = 0; i < list1.length; i++){
     if (exampleLargeText.indexOf(list1[i]) > -1){
    keywordCounter++;
    }
}

vs....

var keywordRegex =  'word1|word2|word3|word4|word5'];  
var keywordCounter = exampleLargeText.toLowerCase().match(new RegExp(SUBMIT_ELEMENT_REGEX , "ig")) || []).length

Is indexOf() still faster despite the fact that you're scanning exampleLargeText 5 times here?

Sam · Accepted Answer

A regular expression like /aaa|bbb|ccc/ will never be more efficient than a simpler (yet still similar, 3 characters) expression like /abc/. This is because regex engines match left to right. The easiest matches would be 'aaa' for the first and 'abc' for the second...each of these take 3 steps. Now, imagine, you try to match 'aabbccx' to both expressions. The first expression would take a total of 33 steps and the second would take 5 steps, this is because each alternation (denoted by |) forces the regex engine to start over. Play around with that on a tool like Regex101.

However, if you were able to make your regular expression more optimized than just checking each word separately, there is a chance it could beat .indexOf(). For instance, if your expression is truly /word1|word2|word3|word4|word5/, it could be rewritten as /word[1-5]/. This is way more efficient than looking for each word separately because now the expression is defined in a simple pattern. Who knows, though, .indexOf() still may be faster depending on the overhead.

That's when benchmarking comes into play -- use jsPerf!

Javascript: Regex vs IndexOf when scanning for a list of keywords

Answers (1)

Related Questions