Is the Perl Parser safe to parse arbitrary code?

Question

I've had a hard time finding an authoritative answer to this. Is the Perl 5 parser safe to parse arbitrary code? Ignoring the implications of running or executing the code, is the parser itself considered safe?

Would a vulnerability in the parser that allows arbitrary code execution be considered a security vulnerability, or just a bug?

It seems that some tickets about parser-specific bugs are rejected from the security list and moved to the general bug queue.

For example, Tony Cook, a "Keeper of the Pumpkin", says:

I think in general we consider such issues as not a security concern, after all, if an attacker can provide code to the parser, you're already vulnerable.

Specifically, I'm trying to clarify these two threat models:

Safe assumes that Perl was able to generate a blacklist/whitelist opcode-tree correctly
Perl (p5p) assumes that all parsing is done with the full capabilities of arbitrary code execution and that if malicious code was provided you're "already vulnerable"

Questions

Would be interesting to know why you ask. Info on that could get you better answers.

If the perl parser is unable to parse code safely, then I believe nothing can secure arbitrary Perl. The whole purpose of Safe.pm and Opcode.pm would then be moot, because you can't safely generate an opcode tree to prune (blacklist opcodes). If so, that's not clear anywhere in the docs, and keeping that kind of thing secretive is a very poor idea. It would be precarious to namedrop and open up software to exploitation.

amon · Accepted Answer

You should not feed untrusted code to perl. The Perl security model is centred on data (e.g. through the taint flag), not on providing a sandbox for code. I.e. you should be able to write secure programs, but not necessarily be able to run arbitrary programs securely. So if someone is able to execute code, you should assume they are indeed able to execute any code.

One aspect of this is Perl's multi-phase execution model. BEGIN blocks are executed as soon as they are parsed. This is necessary in order to parse subsequent code correctly. A use statement is also an implicit BEGIN block. Because code must be executed during parsing, it is incorrect to think that Perl code is first parsed and then executed. Consider this example:

use feature 'say';  # say is not enabled by default

say 4;
BEGIN {
  say 2;
  BEGIN { say 1 }
  say 3;
}
say 5;

Each BEGIN block is executed as soon as parsing it has completed, resulting in the numbered execution order. Importantly, some code is executed before parsing the whole document has completed.
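To make the "executed before parsing has completed" point concrete, here is a minimal sketch. The `@log` array and the `$source` string are made up for illustration. The string contains a BEGIN block followed by a deliberate syntax error, so it can never compile as a whole:

```perl
use strict;
use warnings;

our @log;

# Hypothetical "untrusted" source: a BEGIN block followed by a deliberate
# syntax error, so this string as a whole can never compile successfully.
my $source = <<'END_SOURCE';
BEGIN { push @main::log, 'BEGIN ran' }
my $x = ;
END_SOURCE

# String eval asks perl to compile the source and run it only if that works.
my $compiled = eval "$source; 1";

print 'compiled: ', ($compiled ? 'yes' : 'no'), "\n";
print "log: @log\n";
```

Compilation fails, yet the log still contains the entry pushed by the BEGIN block: that block had already run by the time the parser reached the broken line.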

While parsing, the perl compiler creates an optree. This optree is both an abstract syntax tree for the parser, and the opcodes for the Perl interpreter.
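If you want to look at that optree yourself, the core B module exposes it. The following is only a rough sketch: `demo` and `op_names` are names invented for this example, and the exact list of ops varies between perl versions because of optimizations.

```perl
use strict;
use warnings;
use B qw(svref_2object OPf_KIDS);

# Hypothetical sub whose compiled form we want to inspect.
sub demo { my $x = shift; return $x + 1 }

# Depth-first walk over the ops of a compiled sub, collecting op names.
sub op_names {
    my ($op, $names) = @_;
    return $names unless $$op;    # a B::NULL object means "no op here"
    push @$names, $op->name;
    if ($op->flags & OPf_KIDS) {
        for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
            op_names($kid, $names);
        }
    }
    return $names;
}

my $names = op_names(svref_2object(\&demo)->ROOT, []);
print join(' ', @$names), "\n";
```

The names printed are the same opcodes that the interpreter later executes and that an opcode mask refers to; `perl -MO=Concise` gives a more readable dump of the same structure.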

Perl does offer the ability to filter ops through the Opcode module, which should only be used through the frontends ops or Safe. These can set up a mask which controls which opcodes can be created while the mask is in effect. Thus, if we disallow loop opcodes but a loop is parsed, the parsing will fail at that point. Importantly, the mask does not filter opcodes between compilation and execution (those two are not clearly delineated phases!). Instead, it prevents those opcodes from ever existing.
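The loop case can be sketched with Safe roughly like this. The code strings and variable names are illustrative; `:base_loop` is the Opcode tag that groups the loop-related ops.

```perl
use strict;
use warnings;
use Safe;

my $cpt = Safe->new;         # starts with the ":default" opcode set permitted
$cpt->deny(':base_loop');    # additionally mask the loop-related opcodes

# Needs no loop ops: compiles and runs inside the compartment.
my $sum = $cpt->reval('1 + 2');

# Needs loop ops: compilation is aborted, so none of this string runs.
my $looped = $cpt->reval('my $n = 0; for my $i (1 .. 3) { $n += $i } $n');
my $error  = $@;

print "sum: $sum\n";
print 'loop result: ', (defined $looped ? $looped : 'undef'), "\n";
print "error: $error";
```

The second `reval` returns undef and `$@` reports that an op was trapped by the operation mask. The failure happens while the string is being compiled, not while it runs.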

There are a number of huge caveats with this.

Parsing is error-prone. If there is a vulnerability in the parsing code, that could be exploited by specially crafted input.
The Safe module tries to whitelist or blacklist allowed opcodes. However, it is not necessarily clear that a certain set of allowed opcodes is safe. Perl was not designed with sandboxing in mind.

Also, a lot of functionality is not provided by opcodes but by subroutines, which may have been compiled before the opcode mask was in place. Their functionality or vulnerabilities in their implementation could be used to escape a sandbox. In particular, I expect that many XS modules could be used here because they often expose pointers to Perl code.
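A small sketch of the subroutine caveat, assuming a hypothetical helper `sum_to` that is compiled outside the compartment and then shared into it:

```perl
use strict;
use warnings;
use Safe;

# Compiled here, outside the compartment, before any opcode mask applies.
sub sum_to {
    my ($limit) = @_;
    my $total = 0;
    $total += $_ for 1 .. $limit;
    return $total;
}

my $cpt = Safe->new;
$cpt->deny(':base_loop');    # sandboxed code may not compile loops itself
$cpt->share('&sum_to');      # but it can call this precompiled sub

my $direct       = $cpt->reval('my $t = 0; $t += $_ for 1 .. 4; $t');
my $direct_error = $@;
my $via_sub      = $cpt->reval('sum_to(4)');

print 'direct loop: ', (defined $direct ? $direct : 'undef'), "\n";
print "via shared sub: $via_sub\n";
```

The loop written inside the compartment is trapped, but the same loop runs fine through the shared sub, because its ops already existed before the mask came into effect. Anything reachable this way is part of the attack surface.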

If you must execute untrusted Perl code, it may be best to use operating-system level security features, possibly in addition to the Perl security features. Linux containerization tech comes to mind. If you perform any parsing + execution in a sandboxed process, this would probably be sufficiently secure for untrusted code – iff the sandbox is configured properly. E.g. a seccomp rulechain can be built to allow/deny certain syscalls. Cgroups can set fine-grained resource limits.
