Leo Jiang

Reputation: 26095

Is it possible to change PHP functions' default parameters?

For example, trim() does not remove U+3000, the space character used in Chinese. It would be cumbersome to change every instance of trim() to include U+3000. Is it possible to modify trim()'s default parameter?

Also, PHP's regex's \s doesn't match U+3000 either. Is it possible to somehow make \s match U+3000?

Upvotes: 1

Views: 121

Answers (3)

Markus Malkusch

Reputation: 7868

Unfortunately, trim() is not part of mbstring's function set (mb_*). Otherwise you could simply enable mbstring's Function Overloading Feature.

But thanks to PHP's namespace fallback policy it is possible:

For functions and constants, PHP will fall back to global functions or constants if a namespaced function or constant does not exist.

I.e. you can override trim() (but not \trim()). You have to put your code in a namespace and call trim() unqualified, i.e. without the \ prefix that explicitly selects the global namespace.

namespace myns;

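// "\u{3000}" (IDEOGRAPHIC SPACE) needs PHP 7.0+; on older versions use "\xE3\x80\x80".
function trim($str, $charlist = " \u{3000}") {
    // Escape the characters for use in a character class; "/" is the delimiter.
    $pregCharacters = preg_quote($charlist, '/');
    // The u modifier makes PCRE treat U+3000 as one character instead of three bytes.
    return preg_replace("/^[$pregCharacters]+|[$pregCharacters]+$/u", '', $str);
}

// Unqualified call: resolves to \myns\trim(), not the global \trim().
var_dump(trim("\u{3000} a b c \u{3000}"));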

Didn't think too much about that RegExp. It should just illustrate overriding of trim().

AFAIK the only thing you have to take care of is that \myns\trim() is defined before your first trim() call. The same technique is also very attractive for mocking time() in unit tests.
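To illustrate the time() mocking idea, here is a minimal sketch. The function secondsSince() and the fixed timestamp are made up for the example; only the namespace fallback behaviour comes from the answer above.

```php
<?php
namespace myns;

// Hypothetical override: unqualified time() calls inside myns resolve here
// first and only fall back to the global \time() if this function is missing.
function time() {
    return 1000000000; // arbitrary fixed timestamp for the sketch
}

// Hypothetical code under test; note the unqualified time() call.
function secondsSince($timestamp) {
    return time() - $timestamp;
}

var_dump(secondsSince(999999990)); // int(10)
var_dump(\time() === time());      // bool(false): \time() is still the real clock
```

Code that calls \time() with the leading backslash is not affected, so this only works for unqualified calls inside the namespace.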


Regarding your second question, \s would match U+3000 if you turn on the u-switch (PCRE_UTF8):

var_dump(preg_match("/\s/u", "\u{3000}"));
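A small sketch of how the u modifier could be used for trimming. It assumes PHP 7.0+ (for the "\u{3000}" escape) and UTF-8 input; the variable names are only illustrative.

```php
<?php
$ideographicSpace = "\u{3000}"; // IDEOGRAPHIC SPACE, three bytes in UTF-8

// With the u modifier, \s matches the ideographic space.
var_dump(preg_match('/\s/u', $ideographicSpace)); // int(1)

// A trim built on \s with the u modifier strips it from both ends
// but leaves the ordinary space in the middle alone.
$trimmed = preg_replace('/^\s+|\s+$/u', '', "\u{3000}a b\u{3000}");
var_dump($trimmed === 'a b'); // bool(true)
```

Keep in mind that preg_replace() returns null when the u modifier is used on a string that is not valid UTF-8, so the result is worth checking if the input encoding is not guaranteed.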

Upvotes: 3

Amal Murali

Reputation: 76656

No, it isn't possible to modify the internal workings of the trim() function without modifying the C source code. However, you could create a new function, say customTrim(), and write code in it that removes all the characters you want removed. This only works if you know beforehand which whitespace characters can occur in these strings.

If you need to do this with preg_replace(), you can use the following:

$str = preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u', '', $str);

The regex is from this blog entry. It removes all whitespace characters (including the ones that \s matches) and control characters from both ends of the string. That includes the Unicode character 'IDEOGRAPHIC SPACE' (U+3000).

Test case:

$str = "\u{3000}"; // IDEOGRAPHIC SPACE (PHP 7.0+ escape syntax)
$str = preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u', '', $str);
var_dump($str, mb_strlen($str));

Output:

string(0) ""
int(0)
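The regex could be wrapped into the customTrim() function mentioned earlier. This is just a sketch: the function body is one possible way to do it, and it assumes UTF-8 input and PHP 7.0+ for the "\u{3000}" escape.

```php
<?php
// Hypothetical helper; the name customTrim() follows the suggestion above.
function customTrim($str) {
    // \pZ = Unicode separators (spaces, line/paragraph separators),
    // \pC = control and other invisible characters (tab, newline, ...).
    return preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u', '', $str);
}

// Leading and trailing whitespace goes away, the inner U+3000 stays.
var_dump(customTrim("\u{3000} a\u{3000}b \t\n") === "a\u{3000}b"); // bool(true)
```

Unlike trim(), this returns null if $str is not valid UTF-8, because preg_replace() fails in that case.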

Upvotes: 0

Matthias W.

Reputation: 1077

I don't think you can overload functions in PHP (though I haven't used PHP in a long time). Instead, you could write your own function that first calls trim() if necessary. After that, take a look at the str_replace() function; you might be able to "replace" the Chinese Unicode space character with "an empty character" (i.e. ''). How to write that in your code seems to depend on your character encoding, see also Replace unicode character
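Roughly, that idea might look like the sketch below. The function name is made up, and it assumes the string is UTF-8 encoded and PHP 7.0+ for the "\u{3000}" escape.

```php
<?php
// Hypothetical helper: trim() first, then drop every U+3000 with str_replace().
function trimAndStripIdeographicSpace($str) {
    $trimmed = trim($str); // removes ASCII whitespace at both ends only
    return str_replace("\u{3000}", '', $trimmed);
}

$result = trimAndStripIdeographicSpace(" \u{3000}a\u{3000}b\u{3000} ");
var_dump($result === 'ab'); // bool(true)
```

Note that str_replace() removes U+3000 everywhere in the string, not just at the ends, so the space between "a" and "b" is gone too. If inner spaces must survive, a regex anchored to the start and end of the string is the safer choice.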

Upvotes: -2
