Reputation: 4404
What is the best way to check if a single character is a whitespace?
I know how to check this through a regex.
But I am not sure if this is the best way if I only have a single character.
Isn't there a better way (concerning performance) for checking if it's a whitespace?
If I do something like this, I would miss whitespace characters such as tabs, I guess?
if (ch == ' ') {
...
}
Upvotes: 52
Views: 66555
Reputation: 2217
The following code removes space characters from a string in JavaScript:
var string = "white space";
function removeWhiteSpaceFromString(str) {
  console.log("Before removing the whitespace: " + str);
  var chars = str.split("");
  var text = "";
  // Keep every character except the plain space (tabs and newlines are left in).
  for (var ch of chars) {
    if (ch != " ") {
      text += ch;
    }
  }
  console.log(" After removing the whitespace: " + text);
}
removeWhiteSpaceFromString(string);
Upvotes: 0
Reputation: 13457
Based on this benchmark, it appears the following method would be most performant:
For Performance:
function isWhitespace(c) {
return c === ' '
|| c === '\n'
|| c === '\t'
|| c === '\r'
|| c === '\f'
|| c === '\v'
|| c === '\u00a0'
|| c === '\u1680'
|| (c >= '\u2000' && c <= '\u200a') // U+2000 through U+200A, all of which \s matches
|| c === '\u2028'
|| c === '\u2029'
|| c === '\u202f'
|| c === '\u205f'
|| c === '\u3000'
|| c === '\ufeff'
}
There are, no doubt, some cases where you might want this level of performance (I'm working on a markdown converter and am trying to squeeze out as much performance as possible). However, in most cases, this level of optimization is unnecessary. In such cases, I would recommend something like this:
For Simplicity:
const whitespaceRe = /\s/
function isWhitespace(c) {
return whitespaceRe.test(c)
}
This is more readable and less likely to contain a typo, and therefore less likely to have a bug.
Upvotes: 4
Reputation: 195
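function hasWhiteSpace(s) {
  // No g flag is needed: test() only has to find one match.
  return /\s/.test(s);
}
This works for any whitespace character. Alternatively, if you only need to detect the plain space character, you can use indexOf():
function hasWhiteSpace(s) {
  return s.indexOf(' ') >= 0;
}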
Upvotes: 0
Reputation: 31
@jake's answer, which uses the trim() method, is the best option. If you have a single character as a numeric character code ch (for example, a hex literal such as 0xA0):
String.fromCharCode(ch).trim() === ""
will return true for all whitespace characters.
Unfortunately, a comparison like <= 32 will not catch all whitespace characters. For example, 0xA0 (non-breaking space) is treated as whitespace in JavaScript and yet it is > 32. Searching using indexOf() with a string like "\t\n\r\v" will be incorrect for the same reason.
Here's a short JS snippet that illustrates this: https://repl.it/@saleemsiddiqui/JavascriptStringTrim
Upvotes: 3
Reputation: 5630
While it's not entirely correct, I use this pragmatic and fast solution:
if (ch.charCodeAt(0) <= 32) {...
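To make the "not entirely correct" part concrete, here is a minimal sketch that compares the `<= 32` shortcut with `/\s/` on a handful of sample characters. The helper names `isLowCode` and `isRegexWhitespace` and the sample list are made up for this illustration.

```javascript
// Hypothetical helper: the shortcut shown above.
function isLowCode(ch) {
  return ch.charCodeAt(0) <= 32;
}

// Reference check using the engine's own definition of whitespace.
function isRegexWhitespace(ch) {
  return /\s/.test(ch);
}

// Sample characters: space, tab, NUL (a control character), no-break space, a letter.
var samples = [' ', '\t', '\u0000', '\u00a0', 'a'];

// Collect the samples on which the two checks disagree.
var disagreements = samples.filter(function (ch) {
  return isLowCode(ch) !== isRegexWhitespace(ch);
});

// Print the disagreeing characters as hex code points.
console.log(disagreements.map(function (ch) {
  return 'U+' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
}));
```

With these samples the two checks disagree on U+0000, which the shortcut accepts although it is a control character, and on U+00A0, which the shortcut rejects although `\s` matches it. Whether either case matters depends on the input you expect.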
Upvotes: 6
Reputation: 1939
The regex approach is a solid way to go. But here's what I do when I'm lazy and forget the proper regex syntax:
str.trim() === '' ? alert('just whitespace') : alert('not whitespace');
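One thing to watch with the trim() check is that an empty string also trims to `''`, so it would be reported as "just whitespace". A minimal sketch of a guarded version follows; the helper name `isAllWhitespace` is an assumption made for this example and is not part of any library.

```javascript
// Hypothetical helper: true only for a non-empty string that consists
// entirely of whitespace characters.
function isAllWhitespace(str) {
  return str.length > 0 && str.trim() === '';
}

console.log(''.trim() === '');          // true, although there is no character at all
console.log(isAllWhitespace(''));       // false
console.log(isAllWhitespace(' \t\n'));  // true
console.log(isAllWhitespace('\u00a0')); // true (no-break space)
console.log(isAllWhitespace(' a '));    // false
```

For a single character the length guard only matters when the value may be empty, for example when it comes from `charAt()` with an out-of-range index.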
Upvotes: 30
Reputation: 73
How about this one (Java syntax: the L suffix marks long literals, so it is not valid JavaScript as written): ((1L << ch) & ((ch - 64) >> 31) & 0x100002600L) != 0L
Upvotes: -3
Reputation:
var testWhite = function (x) {
  // Anchored pattern: matches only a string that is exactly one whitespace character.
  var white = /^\s$/;
  return white.test(x.charAt(0));
};
This small function will allow you to enter a string of variable length as an argument and it will report "true" if the first character is white space or "false" otherwise. You can easily put any character from a string into the function using the indexOf or charAt methods. Examples:
var str = "Today I wish I were not in Afghanistan.";
testWhite(str.charAt(9)); // This would test character "i" and would return false.
testWhite(str.charAt(str.indexOf("I") + 1)); // This would return true.
Upvotes: 1
Reputation: 169593
If you only want to test for certain whitespace characters, do so manually; otherwise, use a regular expression, i.e.
/\s/.test(ch)
Keep in mind that different browsers match different characters; e.g., in Firefox, \s is equivalent to (source)
[ \f\n\r\t\v\u00A0\u2028\u2029]
whereas in Internet Explorer, it should be (source)
[ \f\n\r\t\v]
The MSDN page actually forgot the space ;)
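Since the set depends on the engine, one way to see what `\s` matches where your code actually runs is to probe every UTF-16 code unit. This is only a sketch; the function name `listWhitespaceCodeUnits` is made up for this example.

```javascript
// Hypothetical helper: list every UTF-16 code unit that /\s/ matches
// in the engine running this script.
function listWhitespaceCodeUnits() {
  var whitespaceRe = /\s/;
  var matches = [];
  for (var code = 0; code <= 0xffff; code++) {
    if (whitespaceRe.test(String.fromCharCode(code))) {
      // Format as an escape sequence, e.g. \u0020 for the space character.
      matches.push('\\u' + code.toString(16).padStart(4, '0'));
    }
  }
  return matches;
}

// The printed list can differ between engines and engine versions.
console.log(listWhitespaceCodeUnits().join(' '));
```

Running this in each browser or runtime you target shows directly whether characters such as `\u00a0` are included there.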
Upvotes: 52
Reputation: 16455
I have referenced the set of whitespace characters matched by PHP's trim function without shame (minus the null byte, I have no idea how well browsers will handle that).
if (' \t\n\r\v'.indexOf(ch) > -1) {
// ...
}
This looks like premature optimization to me though.
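One caveat with the indexOf() lookup is that it is a substring search, so an empty string or a multi-character string can also come back as a hit. If that matters, a Set lookup is one alternative (a swapped-in technique, not what the snippet above uses). The names `WHITESPACE_CHARS` and `isListedWhitespace` are assumptions for this sketch.

```javascript
// Same five characters as in the indexOf() string above.
var WHITESPACE_CHARS = new Set([' ', '\t', '\n', '\r', '\v']);

// Hypothetical helper: exact membership test, no substring matching.
function isListedWhitespace(ch) {
  return WHITESPACE_CHARS.has(ch);
}

console.log(' \t\n\r\v'.indexOf('') > -1);     // true: the empty string is found at index 0
console.log(' \t\n\r\v'.indexOf('\t\n') > -1); // true: a two-character substring also matches
console.log(isListedWhitespace(''));           // false
console.log(isListedWhitespace('\t\n'));       // false
console.log(isListedWhitespace('\t'));         // true
```

If the input is always exactly one character, both approaches give the same answer for these five characters.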
Upvotes: 13
Reputation: 132227
This covers spaces, tabs, and newlines:
if ((ch == ' ') || (ch == '\t') || (ch == '\n'))
This should be best for performance. Put the whitespace character you expect to be most likely first.
If performance is really important, it is probably best to consider the bigger picture rather than individual operations like this...
Upvotes: 8