Reputation: 1569
I am trying to count the number of words in a given string using the following code:
var t = document.getElementById('MSO_ContentTable').textContent;
if (t == undefined) {
var total = document.getElementById('MSO_ContentTable').innerText;
} else {
var total = document.getElementById('MSO_ContentTable').textContent;
}
countTotal = cword(total);
function cword(w) {
var count = 0;
var words = w.split(" ");
for (i = 0; i < words.length; i++) {
// inner loop -- do the count
if (words[i] != "") {
count += 1;
}
}
return (count);
}
In this code I get the text of a div tag and pass it to the cword() function for counting. However, the return value differs between IE and Firefox. Is any change required in the regular expression? I have checked that both browsers pass the same string, so the problem must be inside the cword() function.
Upvotes: 13
Views: 37355
Reputation: 122908
[EDIT] This is a very old answer. Updated. The initial answer/code snippet can be found at the bottom of this answer.
Nowadays, one would not extend the prototype of native objects (like String). A way to extend the prototype of native objects without the danger of naming conflicts is to use the ES2015 Symbol.
The following snippet is an example of 'symbolically' extending String.prototype (see MDN).
Or check this small Stackblitz project.
const nWords = Symbol(`countWords`);
const letterCount = Symbol(`letterCount`);
// extend String.prototype with symbols
Object.defineProperty(String.prototype, nWords, {
  // match() returns null when there are no words, hence the fallback to 0
  get() { return this.match(/\w+/g)?.length ?? 0; }
});
Object.defineProperty(String.prototype, letterCount, {
  get() {
    return function(letter, caseSensitive = false) {
      const mods = `g${caseSensitive ? "" : "i"}`;
      return this.match(RegExp(letter, mods))?.length ?? 0;
    };
  }
});
const testString = document.querySelector(`code`).textContent;
console.log(`[testString] has ${testString[nWords]} words, contains ${
  testString[letterCount](`D`, true)} "D" and ${
  testString[letterCount](`d`)} "d or D"`);
<h3>testString</h3>
<code>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</code>
Initial answer: you can use split and add a word counter to the String prototype:
if (!String.prototype.countWords) {
String.prototype.countWords = function() {
return this.length && this.split(/\s+\b/).length || 0;
};
}
console.log(`'this string has five words'.countWords() => ${
'this string has five words'.countWords()}`);
console.log(`'this string has five words ... and counting'.countWords() => ${
'this string has five words ... and counting'.countWords()}`);
console.log(`''.countWords() => ${''.countWords()}`);
Upvotes: 22
Reputation: 1796
This is the best solution I've found:
function wordCount(str) {
  var m = str.match(/[^\s]+/g);
  return m ? m.length : 0;
}
This matches runs of non-whitespace characters, which is better than \w+ because \w only matches ASCII letters, digits and _ (see http://www.ecma-international.org/ecma-262/5.1/#sec-15.10.2.6), so words written in other scripts are missed.
If you're not careful with whitespace matching, you'll count empty strings, leading and trailing whitespace, and all-whitespace strings as words. This solution handles strings like ' ' and ' a\t\t!\r\n#$%() d ' correctly (if you define 'correct' as 0 and 4).
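To make the difference between the two patterns concrete, here is a small comparison sketch. The helper names `countNonWhitespace` and `countWordChars` and the sample strings are made up for illustration; only the two regular expressions come from the answers in this thread.

```javascript
// Count runs of non-whitespace characters (same pattern as wordCount above).
function countNonWhitespace(str) {
  const m = str.match(/[^\s]+/g);
  return m ? m.length : 0;
}

// Count runs of \w characters (ASCII letters, digits and underscore) for comparison.
function countWordChars(str) {
  const m = str.match(/\w+/g);
  return m ? m.length : 0;
}

// Hypothetical sample inputs: empty, whitespace only, mixed punctuation,
// non-Latin text, and a word containing an apostrophe.
const samples = ["", "   ", " a\t\t!\r\n#$%() d ", "привет мир", "don't stop"];

for (const s of samples) {
  console.log(JSON.stringify(s), countNonWhitespace(s), countWordChars(s));
}
```

Which result is "right" depends on your definition of a word: the non-whitespace pattern counts "!" and "#$%()" as words, while \w+ ignores them but also splits "don't" in two and skips Cyrillic text entirely.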
Upvotes: 11
Reputation: 21765
// Count words in a string, or what appears as words :-)
function countWordsString(string) {
  // Collapse runs of whitespace into one space and drop leading/trailing whitespace
  string = string.replace(/\s+/g, ' ').trim();
  // An empty string contains no words
  if (string === '') {
    return 0;
  }
  var counter = 1;
  // Each remaining space separates two words, so count the spaces
  string.replace(/ /g, function () {
    counter++;
  });
  return counter;
}
var numberWords = countWordsString(string);
Upvotes: 0
Reputation: 172
I would prefer a RegEx only solution:
var str = "your long string with many words.";
var wordCount = str.match(/(\w+)/g).length;
alert(wordCount); //6
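The regex is
\w+ one or more word characters (letters, digits and underscore)
/g global flag - don't stop after the first match
With the global flag, match() returns an array of all matches, so its length is the word count. The brackets create a capture group, but they are not needed here. Note that match() returns null when nothing matches, so check for null before reading .length if the string may contain no words.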
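As a quick check of how the flag and the brackets affect the result of match(), here is a minimal sketch. The variable names are illustrative only; the sample string is the one from the snippet above.

```javascript
const sample = "your long string with many words.";

// Global flag, with and without a capture group: both return every match.
const withGroup = sample.match(/(\w+)/g);
const withoutGroup = sample.match(/\w+/g);

// No global flag: only the first match is returned.
const firstOnly = sample.match(/\w+/);

// No word characters at all: match() returns null rather than an empty array.
const noWords = "...".match(/\w+/g);

console.log(withGroup.length, withoutGroup.length); // 6 6
console.log(firstOnly[0]);                          // your
console.log(noWords);                               // null
```

The null result is why reading .length directly on the return value throws a TypeError for strings without any words.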
Upvotes: 14
Reputation: 43810
You can make clever use of the replace() method, although you are not replacing anything.
var str = "the very long text you have...";
var counter = 0;
// let's loop through the string and count the words
// (\b+ is not a valid pattern: a word boundary cannot be repeated, so match \w+ instead)
str.replace(/\w+/g, function (a) {
  // for each word found increase the counter value by 1
  counter++;
});
alert(counter);
The regex can be improved, for example to exclude HTML tags.
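One possible way to skip HTML tags is sketched below, assuming the input is a simple markup string. The function name `countWordsIgnoringTags` is hypothetical, and stripping tags with a regular expression is a rough approximation that will not cope with every kind of markup (comments or a ">" inside an attribute value, for instance); in a browser, reading textContent from the element is usually more reliable.

```javascript
// Hypothetical helper: count words in a markup string while ignoring tags.
function countWordsIgnoringTags(html) {
  // Replace anything that looks like a tag with a space,
  // so that words on either side of a tag stay separated.
  const text = html.replace(/<[^>]*>/g, " ");
  let counter = 0;
  // Same replace() trick as above: the callback runs once per word found.
  text.replace(/\w+/g, function (word) {
    counter++;
    return word;
  });
  return counter;
}

console.log(countWordsIgnoringTags("<p>the <b>very</b> long text</p>")); // 4
```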
Upvotes: 3