Reputation: 40152
In the English module and a few other places, users are advised never to use the $&, $` and $' variables or their English equivalents ($MATCH, $PREMATCH and $POSTMATCH), because they slow down all regex use.
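For reference, a minimal sketch of what the three variables hold after a successful match, using the English names. The helper `split_on_match` and the sample string are hypothetical, made up for illustration only.

```perl
use strict;
use warnings;
use English;    # exports $PREMATCH, $MATCH and $POSTMATCH

# Hypothetical helper: returns the text before, inside and after the first
# match of $re in $text, or an empty list when there is no match.
sub split_on_match {
    my ($text, $re) = @_;
    return unless $text =~ $re;
    # Copy right away; the next successful match overwrites these variables.
    my @parts = ($PREMATCH, $MATCH, $POSTMATCH);
    return @parts;
}

my ($pre, $match, $post) = split_on_match('aaa0bbb', qr/\d/);
print "pre=$pre match=$match post=$post\n";    # prints "pre=aaa match=0 post=bbb"
```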
What is a good test case (benchmark) that shows the performance problems?
Upvotes: 2
Views: 306
Reputation: 30841
Here's a simple starting point, looking for a single character in strings of varying lengths. The match variables make copies of the source string, so I expected the penalty to be proportional to the amount of copying required. Reality seems to be the opposite: the relative slowdown shrinks as the string gets longer. (This is why we benchmark, children.) The cost of matching against a longer string outweighs the overhead of making a copy. In retrospect, that makes sense, as the copy is just a memcpy while the regex engine has to scan character by character.
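use 5.010;
use strict;
use warnings;
use Benchmark qw(cmpthese);

for my $n (map { 10 ** $_ } 0 .. 4) {
    # A single digit with $n non-matching characters on each side.
    my $string = 'a' x $n . 0 . 'a' x $n;
    print "N = $n:\n";
    cmpthese(1000000, {
        # /p asks perl to preserve a copy of the matched string for
        # ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH}.
        'w/ match vars'  => sub { $string =~ /\d/p },
        'w/o match vars' => sub { $string =~ /\d/ },
    });
    print "\n";
}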
Results:

N = 1:
(warning: too few iterations for a reliable count)
                    Rate w/ match vars w/o match vars
w/ match vars  1184834/s            --           -54%
w/o match vars 2557545/s          116%             --

N = 10:
                    Rate w/ match vars w/o match vars
w/ match vars  1164144/s            --           -49%
w/o match vars 2283105/s           96%             --

N = 100:
                    Rate w/ match vars w/o match vars
w/ match vars   865052/s            --           -45%
w/o match vars 1560062/s           80%             --

N = 1000:
                    Rate w/ match vars w/o match vars
w/ match vars   224568/s            --           -21%
w/o match vars  284333/s           27%             --

N = 10000:
                    Rate w/ match vars w/o match vars
w/ match vars    26667/s            --           -15%
w/o match vars   31480/s           18%             --
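
To put the rates on a per-match footing, here is a rough sketch that converts them into a slowdown factor and the extra time spent per match. The rates are copied from the tables above; the helper `overhead_ns` is a hypothetical name introduced only for this calculation.

```perl
use strict;
use warnings;

# Rates (matches per second) copied from the results above:
# N => [ with match vars, without match vars ]
my %rate = (
    1     => [ 1184834, 2557545 ],
    10    => [ 1164144, 2283105 ],
    100   => [  865052, 1560062 ],
    1000  => [  224568,  284333 ],
    10000 => [   26667,   31480 ],
);

# Hypothetical helper: extra nanoseconds per match when the copy is made.
sub overhead_ns {
    my ($with, $without) = @_;
    return (1 / $with - 1 / $without) * 1e9;
}

for my $n (sort { $a <=> $b } keys %rate) {
    my ($with, $without) = @{ $rate{$n} };
    printf "N = %-5d  slowdown: %.2fx  extra per match: %.0f ns\n",
        $n, $without / $with, overhead_ns($with, $without);
}
```

On these numbers the extra time per match is roughly 450 ns at N = 1 and roughly 5,700 ns at N = 10000, so the absolute cost of the copy does appear to grow with the string, while its share of the total shrinks. This is a single run, and the N = 1 row carries a reliability warning, so treat the figures as indicative only.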
Upvotes: 5