Eric Strom

Reputation: 40152

What is a good test case showing the regex performance problems of the pre/post/match vars in Perl?

In the English module and a few other places, users are advised never to use the $&, $` and $' variables or their English equivalents $MATCH, $PREMATCH and $POSTMATCH, because they slow down every regex in the program.

What is a good test case (benchmark) that shows the performance problems?

Upvotes: 2

Views: 306

Answers (1)

Michael Carman

Reputation: 30841

Here's a simple starting point: looking for a single character in strings of varying lengths, with the /p flag standing in for the match variables. The match variables make copies of the source string, so I expected the relative penalty to grow with the amount of copying required. Reality seems to be the opposite. (This is why we benchmark, children.) As the string gets longer, the cost of the match itself outweighs the overhead of making a copy, so the percentage difference shrinks. In retrospect, that makes sense, as the copy is just a memcpy while the regex engine has to scan character-by-character.

use 5.010;
use strict;
use warnings;
use Benchmark qw(cmpthese);

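# Try string lengths from 1 to 10,000 on either side of the digit.
for my $n (map { 10 ** $_ } 0 .. 4) {
    # A single digit surrounded by $n 'a's on each side.
    my $string = 'a' x $n . 0 . 'a' x $n;

    print "N = $n:\n";
    cmpthese(1000000, {
        # /p asks perl to save the pre/match/post text for this regex only.
        'w/ match vars'  => sub { $string =~ /\d/p },
        'w/o match vars' => sub { $string =~ /\d/  },
    });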
    print "\n";
}
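
For anyone who hasn't met the /p flag used in the benchmark above: since Perl 5.10 it makes the pre/match/post text available for that one regex through ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH}, without touching $&, $` or $'. A minimal sketch of what it captures (split_at_digit is just a name made up for this illustration, not part of the benchmark):

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: return the text before, at, and after the first
# digit in $string, or an empty list if the string contains no digit.
sub split_at_digit {
    my ($string) = @_;
    return unless $string =~ /\d/p;   # /p saves the three pieces for this match
    return (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

my ($pre, $match, $post) = split_at_digit('aaa0bbb');
say "pre=$pre match=$match post=$post";   # prints: pre=aaa match=0 post=bbb
```

The benchmark's own output follows.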

Results:

N = 1:
            (warning: too few iterations for a reliable count)
                    Rate  w/ match vars w/o match vars
w/ match vars  1184834/s             --           -54%
w/o match vars 2557545/s           116%             --

N = 10:
                    Rate  w/ match vars w/o match vars
w/ match vars  1164144/s             --           -49%
w/o match vars 2283105/s            96%             --

N = 100:
                    Rate  w/ match vars w/o match vars
w/ match vars   865052/s             --           -45%
w/o match vars 1560062/s            80%             --

N = 1000:
                   Rate  w/ match vars w/o match vars
w/ match vars  224568/s             --           -21%
w/o match vars 284333/s            27%             --

N = 10000:
                  Rate  w/ match vars w/o match vars
w/ match vars  26667/s             --           -15%
w/o match vars 31480/s            18%             --
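
Note that the numbers above measure the per-regex /p flag, not the global effect of $& that the question asks about. Because merely mentioning $& anywhere affects every match in that interpreter, the two cases can't share one process. A rough sketch of one way to compare them, assuming a perl that allows list-form pipe opens on your platform: run the same loop in two fresh perls, one of which contains a never-called sub that mentions $&. The loop size, string length and the names time_variant and never_called are arbitrary choices for illustration. On perl 5.20 and later, copy-on-write should make the two timings roughly equal, so expect a visible gap only on older perls.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Child script body: 200_000 matches against a 2001-character string.
# Both variants run exactly this loop and print the number of hits.
my $loop = q{
    my $s = ('a' x 1000) . '0' . ('a' x 1000);
    my $hits = 0;
    for (1 .. 200_000) { $hits++ if $s =~ /\d/ }
    print "$hits\n";
};

# The second variant only adds a sub that is never called but mentions $&.
# Seeing the variable at compile time is enough to trigger the copying
# on older perls.
my %variant = (
    'clean'       => $loop,
    'mentions $&' => $loop . q{ sub never_called { return $& } },
);

# Run one variant in a fresh copy of the current perl ($^X).
# Returns (elapsed wall-clock seconds, child output); the elapsed time
# includes interpreter startup, which is the same for both variants.
sub time_variant {
    my ($code) = @_;
    my $start = time();
    open(my $child, '-|', $^X, '-e', $code) or die "cannot start $^X: $!";
    my $output = do { local $/; <$child> };
    close $child or die "child perl failed: $?";
    return (time() - $start, $output);
}

for my $name (sort keys %variant) {
    my ($seconds, $output) = time_variant($variant{$name});
    chomp $output;
    printf "%-12s %.3fs (%s matches)\n", $name, $seconds, $output;
}
```

Wall-clock timing of a child process is noisy, so run it a few times before drawing conclusions.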

Upvotes: 5
