Greg Nisbet

Reputation: 6994

How does Perl avoid shebang loops?

perl interprets the shebang itself and mimics the behavior of exec*(2). I think it emulates the Linux behavior of splitting on all whitespace rather than the BSD first-whitespace-only behavior, but never mind that.

As a quick demonstration, here is really_python.pl:

#!/usr/bin/env python

# the following line is correct Python but not correct Perl
from collections import namedtuple
print "hi"

prints hi when invoked as perl really_python.pl.

Also, the following programs will do the right thing regardless of whether they are invoked as perl program or ./program.

#!/usr/bin/perl
print "hi\n";

and

#!/usr/bin/env perl
print "hi\n";

I don't understand why the program doesn't loop forever. In both of the cases above, the shebang line either is or resolves to an absolute path to the perl interpreter. It seems like the next thing that should happen is that perl parses the file, notices the shebang, and delegates to the shebang path (in this case, itself). Does perl compare the shebang path to its own argv[0]? Does perl look at the shebang string and see whether it contains "perl" as a substring?

I tried to use a symlink to trigger the infinite loop behavior I was expecting.

$ ln -s /usr/bin/perl /tmp/p

#!/tmp/p
print "hi\n";

but that program printed "hi" regardless of how it was invoked.

On OS X, however, I was able to trick perl into an infinite shebang loop with a script.

Contents of /tmp/pscript

#!/bin/sh
perl "$@"

Contents of perl script

#!/tmp/pscript
print "hi\n";

and this does loop forever (on OS X; I haven't tested it on Linux yet).

perl is clearly going to a lot of trouble to handle shebangs correctly in reasonable situations. It isn't confused by symlinks and isn't confused by normal env stuff. What exactly is it doing?

Upvotes: 11

Views: 678

Answers (2)

ikegami

Reputation: 385849

The documentation for this feature is found in perlrun.

If the #! line does not contain the word "perl" nor the word "indir", the program named after the #! is executed instead of the Perl interpreter. This is slightly bizarre, but it helps people on machines that don't do #!, because they can tell a program that their SHELL is /usr/bin/perl, and Perl will then dispatch the program to the correct interpreter for them.

So, if the shebang contains perl or indir, the interpreter from the shebang line isn't executed.

Additionally, the interpreter from the shebang line isn't executed if argv[0] doesn't contain perl. This is what prevents the infinite loop in your symlink example.

  • When launched using perl /tmp/pscript,

    1. the kernel executes perl /tmp/pscript,
    2. then perl executes /tmp/p /tmp/pscript.
    3. At this point, argv[0] doesn't contain perl, so the shebang line is no longer relevant.
  • When launched using /tmp/pscript,

    1. the kernel executes /tmp/p /tmp/pscript.
    2. At this point, argv[0] doesn't contain perl, so the shebang line is no longer relevant.
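
To make the two scenarios above concrete, here is a rough Python model of the rule. It is only a sketch: hands_off and trace_argv0 are made-up names, the rule is simplified to the two checks described in this answer, and nothing here calls perl itself.

```python
from typing import List


def hands_off(shebang_line: str, argv0: str) -> bool:
    # Simplified rule from this answer: perl execs the shebang interpreter only
    # when the line mentions neither "perl" nor "indir" and argv[0] contains "perl".
    names_perl = "perl" in shebang_line or "indir" in shebang_line
    return not names_perl and "perl" in argv0


def trace_argv0(shebang_line: str, first_argv0: str, max_hops: int = 5) -> List[str]:
    # Follow simulated execs and record the argv[0] each perl process would see.
    interpreter = shebang_line[2:].split()[0]
    seen = [first_argv0]
    while hands_off(shebang_line, seen[-1]) and len(seen) <= max_hops:
        seen.append(interpreter)
    return seen


# Launched as `perl /tmp/pscript`: one hand-off to /tmp/p, then it stops.
print(trace_argv0("#!/tmp/p", "perl"))    # ['perl', '/tmp/p']
# Launched as `/tmp/pscript`: the kernel already ran /tmp/p, so no hand-off.
print(trace_argv0("#!/tmp/p", "/tmp/p"))  # ['/tmp/p']
```

In both cases the chain ends as soon as argv[0] is /tmp/p, because that name no longer contains perl.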

Upvotes: 12

ThisSuitIsBlackNot

Reputation: 24063

The relevant code is in toke.c, the Perl lexer. If:

  • line 1 begins with #! (optionally preceded by whitespace) AND

  • does not contain "perl -" (perl followed by a space and a hyphen) AND

  • does not contain perl (unless followed by a 6, i.e. perl6) AND

  • (on "DOSish" platforms) does not contain a case-insensitive match of perl (e.g. Perl) AND

  • does not contain indir AND

  • the -c flag was not set on the command line AND

  • argv[0] contains perl

the program following the shebang is executed with execv. Otherwise, the lexer just keeps going; perl doesn't exec itself.
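
If it helps, the list of conditions can be written as a single predicate. This is a Python sketch that follows the bullets above, not a translation of toke.c; the function name and the minus_c and dosish parameters are invented for illustration.

```python
def execs_shebang_interpreter(first_line, argv0, minus_c=False, dosish=False):
    """Return True if, per the conditions listed above, perl would exec the
    program named on the shebang line instead of running the script itself."""
    line = first_line.lstrip()          # "#!" may be preceded by whitespace
    if not line.startswith("#!"):
        return False
    if "perl -" in line:
        return False
    idx = line.find("perl")
    # A "perl" match counts unless it is immediately followed by "6" (perl6).
    if idx != -1 and line[idx + 4:idx + 5] != "6":
        return False
    # On "DOSish" platforms the match is also tried case-insensitively.
    if dosish and "perl" in line.lower():
        return False
    if "indir" in line:
        return False
    if minus_c:                         # -c was given on the command line
        return False
    return "perl" in argv0


print(execs_shebang_interpreter("#!foo", "perl"))              # True
print(execs_shebang_interpreter("#!foo", "foo"))               # False
print(execs_shebang_interpreter("#!fooperlbar -p", "perl"))    # False
print(execs_shebang_interpreter("#!PeRl", "perl", dosish=True))  # False
```

The last condition is the only one that depends on how perl was started rather than on the script, which is why the same file can behave differently under a different argv[0].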

As a result, you can do some pretty weird things with the shebang without perl trying to exec another interpreter:

    #!     perl
#!foo perl
#!fooperlbar -p
#!perl 6
#!PeRl          # on Windows

Your symlink example meets all of the conditions listed above, so why isn't there an infinite loop? You can see what's going on with strace:

$ ln -s /usr/bin/perl foo
$ echo '#!foo' > bar
$ strace perl bar 2>&1 | grep exec
execve("/bin/perl", ["perl", "bar"], [/* 27 vars */]) = 0
execve("foo", ["foo", "bar"], [/* 27 vars */]) = 0

Perl actually does exec the link, but because the link's name doesn't contain perl, argv[0] no longer contains perl either. The last condition is therefore not met the second time around, and the loop ends.
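
The same reasoning suggests why the wrapper script from the question behaves differently. The Python sketch below is a toy model, not a trace of real processes: count_execs is a made-up helper, and it assumes that a wrapper running perl "$@" starts the next perl with argv[0] set to perl.

```python
def count_execs(shebang_target, argv0_after_exec, start_argv0="perl", limit=10):
    """Count simulated hand-offs until perl stops re-executing, capped at limit.

    argv0_after_exec is the argv[0] that the next perl process is assumed to
    see once shebang_target has been exec'd.
    """
    argv0 = start_argv0
    execs = 0
    # Hand-off happens while the shebang lacks "perl" and argv[0] contains it.
    while "perl" not in shebang_target and "perl" in argv0 and execs < limit:
        execs += 1
        argv0 = argv0_after_exec
    return execs


# Symlink named foo: the exec'd perl sees argv[0] == "foo", so it stops.
print(count_execs("foo", "foo"))            # 1
# Wrapper that runs `perl "$@"`: argv[0] is "perl" again on every round.
print(count_execs("/tmp/pscript", "perl"))  # 10 (the cap)
```

With the symlink there is exactly one extra exec, matching the strace output; with the wrapper the model never leaves the loop on its own and only stops at the artificial cap.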

Upvotes: 13
