Reputation: 121
Assuming a Perl script that allows users to specify several text filter expressions in a config file, is there a safe way to let them enter regular expressions as well, without the possibility of unintended side effects or code execution? Without actually parsing the regexes and checking them for problematic constructs, that is. There won't be any substitution, only matching.
As an aside, is there a way to test if the specified regex is valid before actually using it? I'd like to issue warnings if something like /foo (bar/ was entered.
Thanks, Z.
The following constructs only execute code if the use re 'eval' pragma is used:
(?{code})
(??{code})
${code}
@{code}
The default is no re 'eval'; so unless I'm missing something, it should be safe to read regular expressions from a file, with the only check being the eval/catch posted by Axeman. At least I haven't been able to hide anything evil in them in my tests.
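A minimal sketch of the kind of test described above, assuming the pattern arrives as a plain string, as it would from a config file. The variable names and the $ran flag are made up for illustration. With no use re 'eval' in scope, compiling the interpolated pattern should die with an "Eval-group not allowed at runtime" error rather than run the embedded code.

```perl
use strict;
use warnings;

# Hypothetical user-supplied pattern with an embedded code block.
# Single quotes keep Perl from interpolating anything here.
our $ran = 0;
my $user_re = '(?{ $main::ran = 1 })';

# No "use re 'eval'" is in effect, so compiling the interpolated
# pattern is expected to die instead of running the code block.
my $compiled = eval { qr/$user_re/ };
my $error    = $@;

print "rejected: $error" unless defined $compiled;
print "code ran: ", ( $ran ? "yes" : "no" ), "\n";
```

This only shows that one construct being refused; it is not a proof that every construct is harmless.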
Thanks again. Z.
Upvotes: 12
Views: 1247
Reputation: 14184
Would the Safe module be of any use with regard to compiling/executing untrusted regular expressions?
Upvotes: 0
Reputation: 29854
This
eval {
    qr/$re/;
};
if ( $@ ) {
    # do something
}
compiles an expression, and lets you recover from an error.
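To cover the warning the question asks for, the eval check could be wrapped in a small helper. This is only a sketch: compile_user_regex and the sample patterns are hypothetical names chosen for illustration, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (compiled_regex, undef) on success
# or (undef, error_message) if the pattern does not compile.
sub compile_user_regex {
    my ($source) = @_;
    my $compiled = eval { qr/$source/ };
    return ( $compiled, undef ) if defined $compiled;
    return ( undef, $@ );
}

# Sample input: one broken pattern and one valid pattern.
for my $source ( 'foo (bar', 'foo|bar' ) {
    my ( $re, $error ) = compile_user_regex($source);
    if ( defined $re ) {
        print "ok: $source\n";
    }
    else {
        warn "invalid regex '$source': $error";
    }
}
```

The error text comes straight from Perl's regex compiler, so it already points at the offending position in the pattern.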
Since you're only going to do matching, you can watch for malicious expressions by looking for these patterns, which would allow arbitrary code to be run:
(?: \( \?{1,2} \{     # '(' followed by '?' or '??', and then '{'
  | \@ \{ \s* \[      # a dereference of a literal array, which may be arbitrary code.
)
Make sure you compile this with the x flag.
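As a rough illustration, the check above might be applied to incoming pattern strings like this. The sample inputs are hypothetical, and the check is only as complete as the pattern itself, so treat it as a first filter rather than a guarantee.

```perl
use strict;
use warnings;

# The pattern from above, compiled with /x so that whitespace and
# comments inside it are ignored.
my $suspicious = qr/
    (?: \( \?{1,2} \{     # '(' followed by '?' or '??', and then '{'
      | \@ \{ \s* \[      # a dereference of a literal array
    )
/x;

# Hypothetical sample inputs, kept in single quotes so nothing is
# interpolated before the check runs.
my @samples = ( 'foo(?:bar)+', '(?{ unlink "x" })', '(??{ "a" })', '@{[ time ]}' );

for my $source (@samples) {
    my $verdict = $source =~ $suspicious ? 'REJECT' : 'ok';
    printf "%-20s %s\n", $source, $verdict;
}
```

Note that the comments inside the /x pattern must not contain the regex delimiter or an unescaped $ or @, because Perl interpolates variables before it strips comments.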
Upvotes: 11
Reputation: 5082
I would suggest not trusting any regular expressions from users. If you are determined to do so anyway, please run perl in taint (-T) mode. In that case, you'll need some form of validation. Instead of using Parse::RecDescent to write your own regular expression parser, as another answer suggests, you should use the existing YAPE::Regex regexp parser, which is probably faster, was written by an expert, and works like a charm.
Finally, since perl 5.10.0, you can plug different regular expression engines into perl (lexically!). You could check whether there's a less powerful regular expression engine available whose syntax is more easily verifiable. If you want to go down that route, read the API description, Avar's re::engine::Plugin, or in general check out any of Avar's plugin engines.
Upvotes: 5
Reputation: 1200
Depending on what you're matching against, and the version of Perl you're running, there might be some regexes that act as an effective denial of service attack by using excessive lookaheads, lookbehinds, and other assertions.
You're best off allowing only a small, well-known subset of regex patterns, and expanding it cautiously as you and your users learn how to use the system, in the same way that many blog commenting systems allow only a small subset of HTML tags.
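One way such a subset might look, purely as a sketch: the allow-list below is an assumption chosen for illustration (word characters, spaces, '.', the quantifiers * + ?, the anchors ^ and $, alternation, and simple bracketed classes), and accept_pattern is a hypothetical helper name. Parentheses, braces, and backslashes are left out on purpose, which rules out groups, embedded code, and every kind of assertion.

```perl
use strict;
use warnings;

# Hypothetical allow-list of characters a user pattern may contain.
# Inside a bracketed class the space is literal even under /x.
my $allowed = qr/\A [\w .*+?^\$|\[\]-]* \z/x;

# Returns a compiled regex, or undef if the pattern uses anything
# outside the allow-list or fails to compile.
sub accept_pattern {
    my ($source) = @_;
    return unless $source =~ $allowed;
    return scalar eval { qr/$source/ };
}

# Hypothetical sample inputs.
for my $source ( 'error|warn', '^foo.*bar$', '(a+)+$', '(?{ 1 })', '[a-z' ) {
    my $re = accept_pattern($source);
    printf "%-12s %s\n", $source, defined $re ? 'accepted' : 'rejected';
}
```

Starting this strict and loosening the character set later is easier than trying to enumerate every dangerous construct up front.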
Eventually Parse::RecDescent might become useful, if you need to do complex analysis of regexes.
Upvotes: 13
Reputation: 994231
You will probably have to do some level of sanitisation. For example, the perlre man page describes the following construct:
(?{ code })
which allows executable code inside a pattern match.
Upvotes: 5