Reputation: 3543
I want the most efficient (in case of speed) solution to remove array of given words from the given string:
So far I have this not working solution:
const excludeWordList = ['the', 'in', 'a', 'an'];
run("the wall into")
run("paintings covered the wall another words into this")
function run(speech) {
for(let a = 0; a < excludeWordList.length; a++) {
speech = speech.replaceAll(excludeWordList[a], '');
}
console.log(speech);
}
As you see the code has three major issues:
It removes characters inside the words not just the single words
The result is not a trimmed string we have extra spaces inside words of the result too
The code is not the most efficient way I think!!!, because I need to loop through all the excludeWordList
array.
I wrote my function as my
and as you see the Gainza
function is the most efficient function in this case:
Upvotes: 0
Views: 198
Reputation: 18901
The problem with a solution using whitespaces as word separator is that it will most likely fail when you use punctuation for example e.g.
'the, Beattles'.split(' ').includes('the')
//=> false
Instead you should use \b
(word boundary):
const excludes = ['the', 'in', 'a', 'an'];
const re = new RegExp('\\b(?:'+excludes.join('|')+')\\b', 'g');
console.log("the.wall.into".replace(re, ''));
console.log("paintings covered the, wall another words into this".replace(re, ''));
Upvotes: 1
Reputation: 23654
Here's a way that doesn't loop, replaces all excluded words with a single regex, and finished up with a trim and extra space cleanup. This should be the fasted method. You can map
the array into a Regex and preserve word boundaries \b
, which have to be escaped when building it dynamically
const excludeWordList = ['the', 'in', 'a', 'an'];
const reg = new RegExp(excludeWordList.map(w => `\\b${w}\\b`).join('|'), 'g')
run("the wall into")
run("paintings covered the wall another words into this")
function run(speech) {
speech = speech.replaceAll(reg, '').trim().replace(/\s\s+/g, ' ');
console.log(speech);
}
Upvotes: 2
Reputation: 20451
You can try using split() and filter().
Edit: indexOf()
complexity is O(N) because of linear search. Since we have a fixed set of words we want to exclude, converting to a set is ideal.
new Set()
is also O(N), but since it is being done only once and your run()
will be called more often it makes sense here. With set, .has()
has O(1) complexity.
const excludeWordList = ['the', 'in', 'a', 'an'];
const excludeWordSet = new Set(excludeWordList);
run("the wall into")
run("paintings covered the wall another words into this")
function run(speech) {
speech = speech.split(' ').filter((a) => {
return (!excludeWordSet.has(a))
}).join(' ');
console.log(speech);
}
filter()
still has O(N) complexity. join()
has O(N). so this is still O(N^2), same as your initial attempt.
Upvotes: 1
Reputation: 3584
I'd use the a filter & set approach to minimize computation time (instead of includes
or indexOf
that iterate the whole array)
const excluded = new Set(['the', 'in', 'a', 'an']);
function run(speech) {
return speech.split(' ')
.filter(word => !excluded.has(word))
.join(' ');
}
run("the wall into")
run("paintings covered the wall another words into this")
Upvotes: 2
Reputation: 1409
const excludeWordList = ['the', 'in', 'a', 'an'];
run("the wall into")
run("paintings covered the wall another words into this")
function run(speech) {
const result = speech.split(' ').filter(word=>!excludeWordList.includes(word)).join(' ')
console.log(result);
}
Upvotes: 1