Reputation: 111899
In the PHP manual on variables, we can read:
Variable names follow the same rules as other labels in PHP. A valid variable name starts with a letter or underscore, followed by any number of letters, numbers, or underscores. As a regular expression, it would be expressed thus: '[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*'
So obviously when we try to run:
$0-a = 5;
echo $0-a;
we get a parse error, as expected.
However, while experimenting I found that variable names can actually contain arbitrary characters (or at least start with a digit and contain hyphens) when using this syntax:
${'0-a'} = 5;
echo ${'0-a'};
it works without any problems.
Also using variable variables like this:
$variable = '0-a';
$$variable = 5;
echo $$variable;
works without any problem.
So the question is: is the sentence I quoted from the manual wrong, is what I showed not a real variable, or is this documented somewhere else in the PHP manual?
I've verified that it works in both PHP 5.6 and 7.1.
A second question: is it safe to use such constructs? Based on the manual, it seems they shouldn't be possible at all.
Upvotes: 15
Views: 3448
Reputation: 31
I've seen things like $función and $año * on php.net (Spanish version) and always wanted to try them, but never did. However, for some reason I wrote some variables like $1a ($1st), $2a, $3a... and it worked, with no '${1a}' or similar needed, but PhpStorm nags me with warnings like "Expected: semicolon" and "Expression is not assignable: Constant reference". So, after reading your experiences here, and to clear the warnings from my editor and avoid possible future problems, I just changed everything to $a, $b, $c, etc. * Note: $ano (año, "year", written without the ñ) means "anus" in Spanish, so nobody likes to use it.
Upvotes: 1
Reputation: 39494
You can literally choose any name for a variable. "i" and "foo" are obvious choices, but "", "\n", and "foo.bar" are also valid. The reason? The PHP symbol table is just a dictionary: a string key of zero or more bytes maps to a structured value (called a zval). Interestingly, there are two ways to access this symbol table: lexical variables and dynamic variables.
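To see the "symbol table is a dictionary" idea for yourself, here is a minimal sketch. The function name demoSymbolTable is my own, made up for illustration. It creates a few oddly named variables in a function's local scope and then dumps that scope with get_defined_vars(), which returns the defined variables as an array keyed by name:

```php
<?php
// Hypothetical helper for illustration: creates oddly named variables in its
// local scope and returns that scope's variables as an associative array.
function demoSymbolTable(): array
{
    $i = 1;                 // lexical variable, key "i"
    ${''} = 'empty name';   // dynamic variable, key is the empty string
    ${"\n"} = 'newline';    // dynamic variable, key is a single newline byte
    ${'foo.bar'} = 2;       // dynamic variable, key contains a dot

    // get_defined_vars() exposes the local symbol table as name => value pairs.
    return get_defined_vars();
}

var_dump(array_keys(demoSymbolTable()));
```

I kept the names away from purely numeric strings on purpose: a key like "0" would be converted to an integer key in the returned array, which muddies the picture.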
Lexical variables are what you read about in the "variables" documentation. Lexical variables define the symbol table key during compilation (i.e., while the engine is lexing and parsing the code). To keep this lexer simple, lexical variables start with a $ sigil and must match the regex [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*. Keeping it simple this way means the parser doesn't have to figure out, for example, whether $foo.bar is a variable keyed by "foo.bar" or the variable "foo" string-concatenated with the constant bar.
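A quick sketch of that $foo.bar case, assuming a constant named bar is defined (the constant, the values, and the function name demoDotAmbiguity are all made up for illustration). The lexical form is read as a concatenation, while the dynamic form uses the whole string as the key:

```php
<?php
// Assumption for illustration: a constant named `bar` exists.
define('bar', 'BAR');

// Hypothetical helper showing both readings side by side.
function demoDotAmbiguity(): array
{
    $foo = 'foo-value';
    ${'foo.bar'} = 'dotted-value';

    return [
        // Lexical syntax: the variable $foo concatenated with the constant bar.
        'lexical' => $foo.bar,
        // Dynamic syntax: the whole string 'foo.bar' is the symbol table key.
        'dynamic' => ${'foo.bar'},
    ];
}

var_dump(demoDotAmbiguity());
```

Without the define() call, the lexical form would fail on the undefined constant rather than look up a variable named "foo.bar".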
Now dynamic variables are where it gets interesting. Dynamic variables let you access those more uncommon variable names. PHP calls these variable variables. (I'm not fond of that name, as their opposite is logically "constant variable", which is confusing. But I'll call them variable variables from here on.) The basic usage goes like:
$a = 'b';
$b = 'SURPRISE!';
var_dump($$a, ${$a}); // both emit a surprise
Variable variables are parsed differently than lexical variables. Rather than defining the symbol table key at lexing time, the symbol table key is evaluated at run time. The logic goes like this: the PHP lexer sees the variable variable syntax (either $$a or more generally ${expression}), the PHP parser defers evaluation of the expression until run time, then at run time the engine uses the result of the expression to key into the symbol table. It's a little more work than lexical variables, but far more powerful.
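As a rough illustration of the run-time evaluation described above, the sketch below builds the key from an expression on every loop iteration. The function name demoRuntimeKeys and the "slot-" prefix are hypothetical, chosen only because a hyphen can never appear in a lexical variable name:

```php
<?php
// Hypothetical helper: creates variables whose names are computed at run time.
function demoRuntimeKeys(int $count): array
{
    for ($n = 1; $n <= $count; $n++) {
        // The expression inside ${...} is evaluated on each iteration,
        // producing the keys "slot-1", "slot-2", and so on.
        ${'slot-' . $n} = $n * 10;
    }

    $name = 'slot-2';

    // $$name and ${$name} resolve the same key; the third entry computes
    // its key from an arithmetic expression.
    return [$$name, ${$name}, ${'slot-' . (1 + 2)}];
}

var_dump(demoRuntimeKeys(3)); // three ints: 20, 20, 30
```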
Inside of ${} you can have an expression that evaluates to any byte sequence. Empty string, null byte, all of it. Anything goes. That is handy, for example, in heredocs. It's also handy for accessing remote variables as PHP variables. For example, JSON allows any character in a key name, and you might want to access those as straight variables (rather than array elements):
$decoded = json_decode('{ "foo.bar" : 1 }');
foreach ($decoded as $key => $value) {
${$key} = $value;
}
var_dump(${'foo.bar'});
Using variable variables in this way is similar to using an array as a "symbol table", like $array['foo.bar'], but the variable variable approach is perfectly acceptable and slightly faster.
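For comparison, the array-as-symbol-table version of the same JSON example might look like the sketch below. The function name decodeToArray is made up for illustration. The only real difference is that the keys stay inside one array instead of becoming variables in the current scope:

```php
<?php
// Hypothetical helper: decodes JSON into an associative array rather than
// creating one variable per key.
function decodeToArray(string $json): array
{
    // Passing true as the second argument makes json_decode return arrays
    // instead of stdClass objects.
    $decoded = json_decode($json, true);

    // Fall back to an empty array when the input is not a JSON object/array.
    return is_array($decoded) ? $decoded : [];
}

$syms = decodeToArray('{ "foo.bar" : 1 }');
var_dump($syms['foo.bar']);
```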
Addendum
By "slightly faster" we are talking so far to the right of the decimal point that the two are practically indistinguishable. It's not until 10^8 symbol accesses that the difference exceeds 1 second in my tests. Average time per access, in seconds:
Set array key: 0.000000119529
Set var-var: 0.000000101196
Increment array key: 0.000000159856
Increment var-var: 0.000000136778
The loss of clarity and convention is likely not worth it. The benchmark I used:
$N = 100000000;
$elapsed = -microtime(true);
$syms = [];
for ($i = 0; $i < $N; $i++) { $syms['foo.bar'] = 1; }
printf("Set array key: %.12f\n", ($elapsed + microtime(true)) / $N);
$elapsed = -microtime(true);
for ($i = 0; $i < $N; $i++) { ${'foo.bar'} = 1; }
printf("Set var-var: %.12f\n", ($elapsed + microtime(true)) / $N);
$elapsed = -microtime(true);
$syms['foo.bar'] = 1;
for ($i = 0; $i < $N; $i++) { $syms['foo.bar']++; }
printf("Increment array key: %.12f\n", ($elapsed + microtime(true)) / $N);
$elapsed = -microtime(true);
${'foo.bar'} = 1;
for ($i = 0; $i < $N; $i++) { ${'foo.bar'}++; }
printf("Increment var-var: %.12f\n", ($elapsed + microtime(true)) / $N);
Upvotes: 16