Reputation: 4727
I have an array like this:
var excludeWords = ["A", "ABOUT", "ABOVE", "ACROSS", "ALL", "ALONG", "AM", "AN", "AND", "ANY", "ASK", "AT", "AWAY", "CAN", "DID", "DIDN'T", "DO", "DON'T", "FOR", "FROM", "HAD", "HAS", "HER", "HIS", "IN", "INTO", "IS", "IT", "NONE", "NOT", "OF", "ON", "One", "OUT", "SO", "SOME", "THAT", "THE", "THEIR", "THERE", "THEY", "THESE", "THIS", "TO", "TWIT", "WAS", "WERE", "WEREN'T", "WHICH", "WILL", "WITH", "WHAT", "WHEN", "WHY"];
I'm trying to write a function, or find any quick way, to remove occurrences of the above words from a sentence. How can I achieve that quickly without looping?
The way I'm doing it now:
var excludeWords = ["A", "ABOUT", "ABOVE", "ACROSS", "ALL", "ALONG", "AM", "AN", "AND", "ANY", "ASK", "AT", "AWAY", "CAN", "DID", "DIDN'T", "DO", "DON'T", "FOR", "FROM", "HAD", "HAS", "HER", "HIS", "IN", "INTO", "IS", "IT", "NONE", "NOT", "OF", "ON", "One", "OUT", "SO", "SOME", "THAT", "THE", "THEIR", "THERE", "THEY", "THESE", "THIS", "TO", "TWIT", "WAS", "WERE", "WEREN'T", "WHICH", "WILL", "WITH", "WHAT", "WHEN", "WHY"];
var sentence = "The first solution does not work for any UTF-8 alphaben. (It will cut text such as Привіт). I have managed to create function which do not use RegExp and use good UTF-8 support in JavaScript engine. The idea is simple if symbol is equal in uppercase and lowercase it is special character. The only exception is made for whitespace.";
$(excludeWords).each(function(index, item) {
var s = new RegExp(item, "gi");
sentence = sentence.replace(s, "");
});
alert(sentence);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
But is there any better solution than looping?
A few more details, based on a comment:
It should never remove part of a word. It should only replace full words.
Upvotes: 1
Views: 678
Reputation: 2185
It would be better to split on word boundaries.
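// Build the lookup set once, upper-cased so the comparison is case-insensitive.
var excluded = new Set(excludeWords.map(w => w.toUpperCase()));
sentence = sentence.split(/\b/).reduce((str, word) => {
  // Skip excluded words; keep every other token (including spaces and punctuation).
  return excluded.has(word.toUpperCase()) ? str : str + word;
}, '').replace(/\s\s+/g, ' ').trim();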
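One thing to keep in mind when splitting on \b: an apostrophe is not a word character, so contractions are broken into several tokens, and entries such as "DON'T" in the list can never equal a single token. A small sketch (the sample string is made up for illustration):

```javascript
// Made-up sample input, only to show how split(/\b/) tokenizes a contraction.
var parts = "Hello, don't go".split(/\b/);

// The apostrophe forms its own token, so "don't" arrives as "don", "'", "t".
console.log(parts); // [ 'Hello', ', ', 'don', "'", 't', ' ', 'go' ]
```

If contractions matter, a regex with the whole word inside \b...\b, or a split on whitespace, may be a better fit.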
Upvotes: 1
Reputation: 386610
You could add a word boundary, \b, so that only whole words are replaced.
var excludeWords = ["A", "ABOUT", "ABOVE", "ACROSS", "ALL", "ALONG", "AM", "AN", "AND", "ANY", "ASK", "AT", "AWAY", "CAN", "DID", "DIDN'T", "DO", "DON'T", "FOR", "FROM", "HAD", "HAS", "HER", "HIS", "IN", "INTO", "IS", "IT", "NONE", "NOT", "OF", "ON", "One", "OUT", "SO", "SOME", "THAT", "THE", "THEIR", "THERE", "THEY", "THESE", "THIS", "TO", "TWIT", "WAS", "WERE", "WEREN'T", "WHICH", "WILL", "WITH", "WHAT", "WHEN", "WHY"],
sentence = "The first solution does not work for any UTF-8 alphaben. (It will cut text such as Привіт). I have managed to create function which do not use RegExp and use good UTF-8 support in JavaScript engine. The idea is simple if symbol is equal in uppercase and lowercase it is special character. The only exception is made for whitespace.";
sentence = excludeWords.reduce(function(r, s) {
return r.replace(new RegExp('\\b' + s + '\\b', "gi"), "");
}, sentence);
console.log(sentence);
Upvotes: 1
Reputation: 318212
You'd split on spaces and, in a filter, just check whether each word is in the array:
var excludeWords = ["A", "ABOUT", "ABOVE", "ACROSS", "ALL", "ALONG", "AM", "AN", "AND", "ANY", "ASK", "AT", "AWAY", "CAN", "DID", "DIDN'T", "DO", "DON'T", "FOR", "FROM", "HAD", "HAS", "HER", "HIS", "IN", "INTO", "IS", "IT", "NONE", "NOT", "OF", "ON", "One", "OUT", "SO", "SOME", "THAT", "THE", "THEIR", "THERE", "THEY", "THESE", "THIS", "TO", "TWIT", "WAS", "WERE", "WEREN'T", "WHICH", "WILL", "WITH", "WHAT", "WHEN", "WHY"];
var sentence = "The first solution does not work for any UTF-8 alphaben. (It will cut text such as Привіт). I have managed to create function which do not use RegExp and use good UTF-8 support in JavaScript engine. The idea is simple if symbol is equal in uppercase and lowercase it is special character. The only exception is made for whitespace.";
var res = sentence.split(" ").filter(w=>!excludeWords.includes(w.toUpperCase())).join(" ");
console.log(res)
If you simply replace strings with a regex, you'll run into issues. For instance, "solution" ends up being "luti", as both "so" and "on" are in the array, so you need to compare complete words instead.
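The difference can be reproduced with a minimal sketch. The two-word list and the input string below are made up for illustration; they are not the asker's data.

```javascript
// Hypothetical two-word list, just to reproduce the "solution" -> "luti" problem.
var words = ["SO", "ON"];
var input = "solution so on";

// Plain substring replacement: also removes "so" and "on" inside "solution".
var naive = words.reduce(function (r, w) {
  return r.replace(new RegExp(w, "gi"), "");
}, input);

// Whole-word comparison: split on spaces and filter against the list.
var wholeWords = input.split(" ").filter(function (w) {
  return !words.includes(w.toUpperCase());
}).join(" ");

console.log(naive.trim()); // luti
console.log(wholeWords);   // solution
```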
Upvotes: 1
Reputation: 6565
You can join the array values into one string, build a single regex from it, apply that to the sentence, and then split the result back into an array.
var excludeWords = ["A", "ABOUT", "ABOVE", "ACROSS", "ALL", "ALONG", "AM", "AN", "AND", "ANY", "ASK", "AT", "AWAY", "CAN", "DID", "DIDN'T", "DO", "DON'T", "FOR", "FROM", "HAD", "HAS", "HER", "HIS", "IN", "INTO", "IS", "IT", "NONE", "NOT", "OF", "ON", "One", "OUT", "SO", "SOME", "THAT", "THE", "THEIR", "THERE", "THEY", "THESE", "THIS", "TO", "TWIT", "WAS", "WERE", "WEREN'T", "WHICH", "WILL", "WITH", "WHAT", "WHEN", "WHY"];
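var array_to_string = excludeWords.join('|'); // alternation: match any one of the words
var s = new RegExp('\\b(?:' + array_to_string + ')\\b', "gi"); // \b restricts matches to whole words
sentence = sentence.replace(s, ""); // sentence is the string from the question
var excludewords_updated = sentence.trim().split(/\s+/); // remaining words as an array
This is how you can do it without writing an explicit loop.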
Upvotes: 1
Reputation: 214969
You're almost there. The trick is to combine all the words into one big regexp so the replacement is done just once. The \\b's ensure that you're actually replacing whole words and not just substrings.
var excludeWords = ["A", "ABOUT", "ABOVE", "ACROSS", "ALL", "ALONG", "AM", "AN", "AND", "ANY", "ASK", "AT", "AWAY", "CAN", "DID", "DIDN'T", "DO", "DON'T", "FOR", "FROM", "HAD", "HAS", "HER", "HIS", "IN", "INTO", "IS", "IT", "NONE", "NOT", "OF", "ON", "One", "OUT", "SO", "SOME", "THAT", "THE", "THEIR", "THERE", "THEY", "THESE", "THIS", "TO", "TWIT", "WAS", "WERE", "WEREN'T", "WHICH", "WILL", "WITH", "WHAT", "WHEN", "WHY"];
var sentence = "The first solution does not work for any UTF-8 alphaben. (It will cut text such as Привіт). I have managed to create function which do not use RegExp and use good UTF-8 support in JavaScript engine. The idea is simple if symbol is equal in uppercase and lowercase it is special character. The only exception is made for whitespace.";
var re = new RegExp(`\\b(${excludeWords.join('|')})\\b`, 'gi');
sentence = sentence.replace(re, "");
console.log(sentence);
Note that this can leave consecutive spaces in the string. These can easily be removed with replace(/\s+/g, ' ').trim().
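Putting the pieces together, here is a minimal sketch of the single-regexp approach with the whitespace cleanup included. The names removeWords and escapeRegExp are hypothetical helpers, not part of the answer above, and the escaping step is an added assumption that only matters if a word contains regex metacharacters. Also note that \b in JavaScript is based on ASCII word characters, so this sketch is not expected to handle non-Latin words such as Привіт correctly.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: removes whole-word, case-insensitive matches of `words`
// from `sentence`, then collapses the leftover whitespace.
function removeWords(sentence, words) {
  var pattern = words.map(escapeRegExp).join("|");
  var re = new RegExp("\\b(?:" + pattern + ")\\b", "gi");
  return sentence.replace(re, "").replace(/\s+/g, " ").trim();
}

var result = removeWords(
  "The idea is simple: it will not cut the solution.",
  ["THE", "IS", "IT", "WILL", "NOT", "SO", "ON"]
);
console.log(result); // idea simple: cut solution.
```

"solution" survives even though "SO" and "ON" are in the list, because neither has a word boundary on both sides inside that word.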
Upvotes: 4