Reputation: 1
Perl has a notion of an undefined function: a function that's declared but not defined.
sub foo;
foo(); # Undefined subroutine error
bar(); # Undefined subroutine error
This function foo now exists in the symbol table, and it can be used to resolve a method call. But why does this "feature" even exist? In C, it's because functions are type-checked and sometimes you want to have a call before the definition (such as to resolve a circular dependency). But Perl has no such feature, and all function symbols are resolved at runtime, not compile time.
If it's for prototypes, then why should an undefined function exist if it does not have a prototype?
If it's not for prototypes, why does it exist at all?
And why is it that undefined subroutines are used in method resolution? Why not ignore them entirely? You can't call them, and as far as I can see they're internal implementation details at best. That is, if a function isn't defined, why can't we continue method resolution as if it did not exist? It seems like it would be less confusing.
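The behaviour being asked about can be reproduced with a small sketch. The package names Base, Derived and Plain are made up for illustration; the only difference between the two subclasses is the bare declaration in Derived.

```perl
use strict;
use warnings;

package Base    { sub hello { return "Base::hello" } }
package Derived { our @ISA = ('Base'); sub hello; }   # declared, never defined
package Plain   { our @ISA = ('Base'); }              # no declaration at all

# No stub in Plain, so method resolution continues on to Base.
my $plain = Plain->hello;

# The stub in Derived is found first, so the call dies instead of reaching Base.
my $derived = eval { Derived->hello };
my $error   = $@;

print "Plain:   $plain\n";
print "Derived: ", (defined $derived ? "$derived\n" : "died: $error");
```

The declaration alone is enough to stop the search at Derived, which is the effect the question finds confusing.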
Upvotes: 22
Views: 2715
Reputation: 80384
The reason why "Perl has a notion of an undefined function" is that Perl's compiler is a one-pass compiler. All else follows from this simple principle. That's why:
printf "it is '%s'\n", some_function();
is a syntactically legal statement as far as the compiler is concerned. This is easily verified via perl -c to just compile but not run the code:
% perl -ce 'printf "it is '%s'\n", some_function()'
-e syntax OK
Sure, if you try to run that, the interpreter will die because you tried to call an undefined subroutine, but that's not really the compiler's business. For further insight, you should examine the compiler's resulting parse tree using the B::Concise module:
% perl -MO=Concise,-exec -e 'printf "it is '%s'\n", some_function()'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> pushmark sM
4 <$> const(PV "it is %s\n") sM
5 <0> pushmark s
6 <$> gv(*some_function) s/EARLYCV
7 <1> entersub[t2] lKMS/LVINTRO,TARG,INARGS
8 <@> prtf vK
9 <@> leave[1 ref] vKP/REFC
-e syntax OK
Look specifically at opcode 6: gv(*some_function) s/EARLYCV. That's telling you this was a coderef that was used before the compiler saw a definition for it.
The very same parse tree is obtained by placing the subroutine definition after the code that calls it:
% perl -MO=Concise,-exec -e 'printf "it is '%s'\n", some_function(); sub some_function { time }'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> pushmark sM
4 <$> const(PV "it is %s\n") sM
5 <0> pushmark s
6 <$> gv(*some_function) s/EARLYCV
7 <1> entersub[t2] lKMS/LVINTRO,TARG,INARGS
8 <@> prtf vK
9 <@> leave[1 ref] vKP/REFC
-e syntax OK
This is quite different when the compiler already knows what coderef that name is bound to at compile time, which you can effect by placing the definition before the code that calls it:
% perl -MO=Concise,-exec -e 'sub some_function { time } printf "it is '%s'\n", some_function()'
1 <0> enter
2 <;> nextstate(main 3 -e:1) v:{
3 <0> pushmark sM
4 <$> const(PV "it is %s\n") sM
5 <0> pushmark s
6 <$> gv(IV \&main::some_function) s
7 <1> entersub lKMS/LVINTRO,INARGS
8 <@> prtf vK
9 <@> leave[1 ref] vKP/REFC
-e syntax OK
Now look what has happened to opcode 6! It has become gv(IV \&main::some_function) s. Now the interpreter won't have to look that coderef up at runtime. The compiler has already provided it.
If you declare the function before the compiler sees you use it, it still can't know what coderef that resolves to until runtime.
% perl -MO=Concise,-exec -e 'sub some_function; printf "it is '%s'\n", some_function(); sub some_function { time }'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> pushmark sM
4 <$> const(PV "it is %s\n") sM
5 <0> pushmark s
6 <$> gv(*some_function) s
7 <1> entersub[t2] lKMS/LVINTRO,TARG,INARGS
8 <@> prtf vK
9 <@> leave[1 ref] vKP/REFC
-e syntax OK
Now opcode 6 reads gv(*some_function) s, because the interpreter still has to look it up in the package symbol table to find the coderef. The compiler wasn't able to provide the coderef's address to the interpreter.
You might find this surprising, given that you can clearly see the function definition yourself later on. But the compiler cannot.
Why not?
It's what I said at the beginning: because Perl is a one-pass compiler, that's why. That's the answer to your question.
All discussions about function prototypes, AUTOLOAD intercepts, and method resolution are distractions that get lost in the weeds. They describe several interesting ramifications that follow naturally from this initial principle. While these are all perfectly valid (and valuable) observations, they ultimately fail to answer your question because they do not identify the unitary cause behind it all: the single-pass nature of the Perl compiler.
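A minimal sketch of what a run-time lookup means in practice, assuming nothing beyond core Perl. The names call_it and late_bound are invented for this example: the call site is compiled before any definition exists, fails while the name is still empty, and succeeds once a body is installed under that name.

```perl
use strict;
use warnings;

# Compiled before the compiler has seen any definition of late_bound(),
# so the call can only be resolved through the symbol table at run time.
sub call_it { return late_bound() }

# Nothing is installed under that name yet, so the call dies.
my $before = eval { call_it() };
my $error  = $@;

# Install a body at run time; the already-compiled call site now finds it.
*late_bound = sub { return "installed at run time" };
my $after = call_it();

print "before: ", (defined $before ? $before : "died"), "\n";
print "after:  $after\n";
```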
Upvotes: 6
Reputation: 239781
It is about prototypes, and the declaration (or not) of a function does have an effect at compile time. Consider
print foo + 42;
In isolation, this is equivalent to print('foo' + 42); here foo is a "bareword". If you have strict 'subs' enabled, it will instead give you a compilation error saying that barewords are forbidden.
sub foo;
print foo + 42;
This is equivalent to print(foo(42)); the compiler knows that foo is a sub and it has no prototype, so it consumes everything after it in "list op" fashion, and what follows it is the term +42.
sub foo();
print foo + 42;
This is equivalent to print(foo() + 42); the compiler knows that foo has a prototype and that it takes no arguments, so none will be looked for, and foo and 42 will be the operands of the + operator.
sub foo($);
print foo + 42;
Like case 2, this is equivalent to print(foo(42)). I think there's probably a test I could have used to distinguish them.
Point being, whether a sub is known or not does have effects at compile-time, and Perl gives you the option to declare that fact before you define the body of the sub, rarely needed as it may be.
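The cases above can be put side by side in one runnable sketch. The sub names plain, nullary and unary are invented for illustration, and the comma-list lines are one possible way to tell the no-prototype case from the ($) case, offered as a suggestion rather than as the test this answer had in mind.

```perl
use strict;
use warnings;

sub plain;        # no prototype: parsed like a list operator
sub nullary();    # empty prototype: takes no arguments
sub unary($);     # ($) prototype: parsed like a named unary operator

my $r1 = plain + 42;      # parsed as plain(+42)
my $r2 = nullary + 42;    # parsed as nullary() + 42
my @r3 = (plain 1, 2);    # parsed as (plain(1, 2))
my @r4 = (unary 1, 2);    # parsed as (unary(1), 2)

# Bodies supplied after the calls; prototypes match the declarations above.
sub plain     { return "plain(@_)" }
sub nullary() { return 1 }
sub unary($)  { return "unary(@_)" }

print "$r1\n$r2\n@r3\n@r4\n";
```

All four statements compile under strict only because the declarations come first; the bodies can follow later in the file.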
As for why it has an impact on method resolution order: most likely it's a side-effect, but it's not wrong. Forward-declaring a sub is supposed to mean that you intend to provide the definition before compilation is done. If you don't, then you will get a runtime error when you try to call it. It seems fair enough to me that if such a declaration is in a package in the MRO, then that means "there should be a method here, but I forgot", and you get an error when the MRO reaches that package.
Upvotes: 31
Reputation: 66873
It's the same as with other types, I'd say; just the artifact of how it is parsed.
my $hr = { a => 1 }; # $hr name introduced at compile time, assigned or not
So the same goes with
sub name { ... }; # "name" "declared" at compile time
and saying just sub name; is about the same as saying my $hr; -- and then having a symbol with no definition attached to it.
I don't know how the parser works but I'd guess that it has to take sub name first and "bind" the definition later, so by happenstance we can then also say just sub name; and have that name.
I mean to say that this is the "reason", per the question "But why does this "feature" even exist?"
But then once it is known to the compiler ahead of time that there is a sub with that name then there may be various uses of that fact.
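The analogy can be checked with exists and defined, both of which accept the &name form. This is only a sketch; never_mentioned is a deliberately undeclared name used for contrast.

```perl
use strict;
use warnings;

my $hr;      # the variable exists, but holds no value yet
sub name;    # the sub name exists, but has no body yet

my $var_defined = defined $hr              ? 1 : 0;
my $sub_exists  = exists  &name            ? 1 : 0;
my $sub_defined = defined &name            ? 1 : 0;
my $no_such     = exists  &never_mentioned ? 1 : 0;

print "variable defined:   $var_defined\n";
print "sub exists:         $sub_exists\n";
print "sub defined:        $sub_defined\n";
print "undeclared exists:  $no_such\n";
```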
Upvotes: 5
Reputation: 117
This has saved me a lot of debugging:
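use File::Basename qw(basename);   # provides basename()
our $AUTOLOAD;                     # Perl stores the name of the missing sub here
sub AUTOLOAD
{
    return if $AUTOLOAD =~ /::DESTROY$/;   # don't log implicit destructor calls
    my (undef,$filename,$lineno) = caller;
    my ($fn) = basename($filename);
    # logmsg() is my own logging routine, defined elsewhere
    logmsg('E',"Undefined reference ($fn/$lineno): ref=$AUTOLOAD, refer=$ENV{HTTP_REFERER}");
}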
Upvotes: 3
Reputation: 385645
I can think of these reasons:
Allows placing the definition of a sub after a call to it if it has a prototype or attributes.
Notable specifics:
Allows AUTOLOADed and similar subs to be declared.
Notable specifics:
Can be used as an abstract method.
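As a hedged illustration of the AUTOLOAD reason: a forward declaration makes can() report the method even though only AUTOLOAD ever runs. The package Lazy and the method names greet and other are hypothetical.

```perl
use strict;
use warnings;

package Lazy {
    sub greet;          # declaration only; AUTOLOAD supplies the behaviour

    our $AUTOLOAD;
    sub AUTOLOAD {
        my $class = shift;
        my $name  = $AUTOLOAD =~ s/.*:://r;   # strip the package qualifier
        return if $name eq 'DESTROY';
        return "autoloaded $name";
    }
}

my $can_greet = Lazy->can('greet') ? 1 : 0;   # declared, so can() sees it
my $can_other = Lazy->can('other') ? 1 : 0;   # not declared, invisible to can()
my $greeting  = Lazy->greet;                  # stub found, AUTOLOAD is called
my $other     = Lazy->other;                  # nothing found, AUTOLOAD is called

print "can greet: $can_greet, can other: $can_other\n";
print "$greeting\n$other\n";
```

Both calls end up in AUTOLOAD; the declaration only changes what can() reports, which matters to callers that probe for a method before calling it.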
Upvotes: 2