Reputation: 1042
I am having a hard time figuring out what is happening with this regexp, which is meant to match multiple whitespace characters:
$str = ' ';
if (preg_match_all('/\s{2,}/', $str, $matches)) {
    var_dump($matches);
}
If I replace the value of $str with 3 "real" spaces, it works as expected, so the characters in $str are obviously not all plain spaces (they were copied and pasted from another source). But I need to match them so I can replace them with real spaces or something else.
My question: what are those space-looking characters in $str and, more importantly, how do I target them in a regexp?
Upvotes: 0
Views: 346
Reputation: 11
The whitespace characters matched by \s may include the real space (code 0x20), the horizontal tab (0x09), carriage return (0x0D), line feed (0x0A) and form feed (0x0C). So if you want to turn all of these characters into real spaces, you may use this line:
$str=preg_replace('/\s/',' ',$str);
Or, if you want to replace a sequence of two or more whitespace characters with just a single real space, use this instead:
$str=preg_replace('/\s{2,}/',' ',$str);
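Note that the list above does not include the non-breaking space, so it can help to check which bytes the string really contains before choosing a pattern. A minimal sketch, assuming the string is a space, a UTF-8 non-breaking space (bytes C2 A0) and another space; that content is an assumption, written with escapes so it survives copy and paste:

```php
<?php
// Assumed content: space, UTF-8 non-breaking space (U+00A0 = C2 A0), space.
$str = " \xC2\xA0 ";

// Dump the raw bytes as hex to see what is really in the string.
$hex = bin2hex($str);
echo $hex, "\n"; // 20c2a020

// Without the u modifier the pattern works byte by byte, and the bytes
// C2 and A0 are not whitespace in the default C locale, so no run of
// two or more whitespace bytes is expected to be found here.
$count = preg_match('/\s{2,}/', $str);
var_dump($count);
```

Anything in the hex dump other than 20, 09, 0a, 0c or 0d is a byte that plain \s is unlikely to match.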
Upvotes: 0
Reputation: 26375
The middle character is a UTF-8 encoded non-breaking space. Add the UTF-8 modifier u to your regex and it'll work just fine, e.g. /\s{2,}/u.
Outputs:
array(1) {
[0]=>
array(1) {
[0]=>
string(4) " "
}
}
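For the replacement the question asks about, the same modifier can be used with preg_replace. A rough sketch, assuming a hypothetical sample string with a UTF-8 non-breaking space between two regular spaces:

```php
<?php
// Hypothetical sample: "a", space, UTF-8 NBSP (bytes C2 A0), space, "b".
$str = "a \xC2\xA0 b";

// With the u modifier the subject is treated as UTF-8, and \s also
// matches Unicode whitespace such as U+00A0.
$clean = preg_replace('/\s{2,}/u', ' ', $str);

var_dump($clean); // string(3) "a b"
```

If the input might not be valid UTF-8, preg_replace returns null when the u modifier is used, so it is worth checking the result before using it.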
Upvotes: 2