Reputation: 1730
I have two strings like these:
s1 = "This is a foo bar sentence ."
s2 = "This sentence is similar to a foo bar sentence ."
I want to split each string into this format:
x1 = {"This":1,"is":1,"a":1,"bar":1,"sentence":1,"foo":1}
x2 = {"This":1,"is":1,"a":1,"bar":1,"sentence":2,"similar":1,"to":1,"foo":1}
That is, split each string into words and count them, producing pairs in which the string is a word and the number is how many times that word occurs in the string.
Upvotes: 1
Views: 7752
Reputation: 338158
Remove punctuation, normalize whitespace, lowercase, split on spaces, then loop over the words and count their occurrences in an index object.
function countWords(sentence) {
    var index = {},
        words = sentence
            .replace(/[.,?!;()"'-]/g, " ")  // punctuation -> spaces
            .replace(/\s+/g, " ")           // collapse runs of whitespace
            .toLowerCase()
            .split(" ")
            .filter(Boolean);               // drop empty strings left by leading/trailing spaces

    words.forEach(function (word) {
        if (!(index.hasOwnProperty(word))) {
            index[word] = 0;
        }
        index[word]++;
    });

    return index;
}
Or, in ES6 arrow-function style:
const countWords = sentence => sentence
    .replace(/[.,?!;()"'-]/g, " ")
    .replace(/\s+/g, " ")
    .toLowerCase()
    .split(" ")
    .filter(Boolean)  // drop empty strings left by leading/trailing spaces
    .reduce((index, word) => {
        if (!(index.hasOwnProperty(word))) index[word] = 0;
        index[word]++;
        return index;
    }, {});
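For comparison, here is a minimal sketch of the same idea applied to the two strings from the question. It is not part of the approach above: `countWordsWithMap` is a hypothetical name, and it assumes that matching the words directly (instead of replacing punctuation and then splitting) and counting into a `Map` is acceptable. Like the versions above, it lowercases, so "This" is counted as "this".

```javascript
// Hypothetical helper (name and Map-based design are assumptions for illustration).
function countWordsWithMap(sentence) {
    // A word is any run of characters that is neither whitespace nor one of
    // the punctuation marks treated as separators; match() returns null when
    // there are no words, hence the fallback to an empty array.
    const words = sentence.toLowerCase().match(/[^\s.,?!;()"'-]+/g) || [];
    const counts = new Map();
    for (const word of words) {
        counts.set(word, (counts.get(word) || 0) + 1);
    }
    return counts;
}

const s1 = "This is a foo bar sentence .";
const s2 = "This sentence is similar to a foo bar sentence .";

// Convert each Map to a plain object for display.
console.log(Object.fromEntries(countWordsWithMap(s1)));
// { this: 1, is: 1, a: 1, foo: 1, bar: 1, sentence: 1 }
console.log(Object.fromEntries(countWordsWithMap(s2)));
// { this: 1, sentence: 2, is: 1, similar: 1, to: 1, a: 1, foo: 1, bar: 1 }
```

A `Map` keeps the counts separate from inherited object properties, so no `hasOwnProperty` check is needed; `Object.fromEntries` is only used here to print the result as a plain object.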
Upvotes: 8