Reputation: 5059
I found a clever trick to preallocate memory for a string. However, the following code snippet performs worse with the trick than without it (that is, with the vec($str, 0x100000, 8)=0; statement commented out).
use Time::HiRes qw( gettimeofday );
my $big = "a" x 100;
my $str = "";
vec($str, 0x100000, 8)=0;
$ts = getTS();
for ($i=0; $i < 1000000; $i ++) {
$str = "";
for ($j=0; $j<100; $j++) {
$str .= $big;
}
}
printf "took %f secs\n", getTS() - $ts;
sub getTS {
my ($seconds, $microseconds) = gettimeofday;
return $seconds + (0.0+ $microseconds)/1000000.0;
}
With the clever trick, it took 9.1 secs. Without the clever trick, it took 7.8 secs.
The clever trick should have been faster because it doesn't need to make so many realloc() calls. Any idea why?
Upvotes: 0
Views: 138
Reputation: 98378
Calling vec() is an extra expensive operation; you have to be saving a whole lot of realloc data-moving to make it worth it. I'm not sure why you have nested loops in your code; any reallocs necessary will only be done in the first run of the inner loop, not later runs of it. My benchmark of your code, adjusted to have vec only allocate the buffer you actually need, shows the vec version as marginally slower:
use strict;
use warnings;
use Benchmark 'cmpthese';
cmpthese( 10, {
'with_vec' => sub {
my $big = "a" x 100;
my $str;
undef $str; # start with no string buffer for benchmarking purposes
vec($str, 9999, 8)=0;
for (my $i=0; $i < 1000000; $i ++) {
$str = "";
for (my $j=0; $j<100; $j++) {
$str .= $big;
}
}
},
'without_vec' => sub {
my $big = "a" x 100;
my $str;
undef $str; # start with no string buffer for benchmarking purposes
for (my $i=0; $i < 1000000; $i ++) {
$str = "";
for (my $j=0; $j<100; $j++) {
$str .= $big;
}
}
},
});
Producing:
            s/iter without_vec    with_vec
without_vec   8.43          --         -3%
with_vec      8.15          3%          --
(though occasionally with_vec was faster)
(undef $str forces the code to use a fresh string buffer each time; without that, $str's buffer size expands to its maximum the first time Benchmark runs the code and remains the same thereafter.)
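To make the buffer behaviour described above visible, here is a minimal sketch using the core B module. The helper name buffer_size is hypothetical (not part of any library), and the exact buffer sizes depend on the Perl version and its allocator, so only the lengths are predictable:

```perl
use strict;
use warnings;
use B qw( svref_2object );

# Hypothetical helper: allocated size (in bytes) of a scalar's string buffer.
sub buffer_size { svref_2object($_[0])->LEN }

my $str = "";
vec($str, 9999, 8) = 0;    # assigning byte 9999 extends $str to 10_000 bytes
my $len_after_vec = length $str;
my $buf_after_vec = buffer_size(\$str);

$str = "";                 # resets the length; the buffer is expected to stay
my $len_after_reset = length $str;
my $buf_after_reset = buffer_size(\$str);

printf "after vec:   length=%d buffer=%d\n", $len_after_vec,   $buf_after_vec;
printf "after reset: length=%d buffer=%d\n", $len_after_reset, $buf_after_reset;
```

If the buffer size printed on the second line is still at least 10_000, the allocation survived the assignment of the empty string, which is why only undef $str gives each benchmark run a fresh buffer.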
Here's an adjusted example where preallocating does make a difference:
cmpthese( -10, {
'with_vec' => sub {
my $big = "a" x 1;
my $str;
undef $str;
vec($str, 9999999, 8)=0;
$str = "";
for (my $j=0; $j<10000000; $j++) {
$str .= $big;
}
},
'without_vec' => sub {
my $big = "a" x 1;
my $str;
undef $str;
$str = "";
for (my $j=0; $j<10000000; $j++) {
$str .= $big;
}
},
});
Producing:
              Rate    with_vec without_vec
with_vec    1.29/s          --         -3%
without_vec 1.33/s          3%          --
(though results were erratic; a third of the time without_vec was faster).
Upvotes: 1
Reputation: 385496
Your test makes no sense. Your vec only has an effect when $i=0 (the first pass of the loop has the same effect as vec for the later passes of the loop), so vec's pre-allocation only makes a difference for 1/1,000,000 of the time your program is executing! That means the 1.3s difference has nothing to do with whether $str's string buffer is pre-allocated or not.
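A rough way to check this claim is to count how often the buffer size changes in each pass of the outer loop. This is only a sketch: buffer_size is a hypothetical helper built on the core B module, and the count for the first pass will vary with the Perl version.

```perl
use strict;
use warnings;
use B qw( svref_2object );

# Hypothetical helper: allocated size (in bytes) of a scalar's string buffer.
sub buffer_size { svref_2object($_[0])->LEN }

my $big = "a" x 100;
my $str = "";
my @growths;    # number of buffer size changes observed in each outer pass

for my $pass (0 .. 2) {
    $str = "";                       # same reset as in the question's loop
    my $count = 0;
    for (1 .. 100) {
        my $before = buffer_size(\$str);
        $str .= $big;
        $count++ if buffer_size(\$str) != $before;
    }
    push @growths, $count;
}

print "pass $_: $growths[$_] buffer growths\n" for 0 .. $#growths;
```

Only the first pass should report any growth; later passes reuse the buffer that the first pass left behind, just as they would reuse one created by vec.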
Did you just run each test once? That's not an appropriate way of doing a benchmark! If you run a proper test, you'll see that pre-allocating doesn't help —the gain is so minor it gets lost— but it doesn't hurt either; it simply has no effect.
                  Rate  deoptimized     baseline preallocated
deoptimized  78084/s           --          -1%          -1%
baseline     78668/s           1%           --          -0%
preallocated 78928/s           1%           0%           --
Test:
use strict;
use warnings;
use Benchmark qw( cmpthese );
my $big = "a" x 100;
my $preallocated;
vec($preallocated, 0x100000, 8)=0;
cmpthese(-3, {
deoptimized => sub {
undef(my $str);
$str .= $big for 1..100;
},
baseline => sub {
my $str;
$str .= $big for 1..100;
},
preallocated => sub {
$preallocated = "";
$preallocated .= $big for 1..100;
},
});
I'm not saying pre-allocating never helps. There could be scenarios where it does —larger numbers?— just not here.
One of the reasons it has little effect is that Perl allocates exponentially more memory, which is to say the number of allocations increases only logarithmically as the loop sizes grow. The following shows only 21 reallocs for the 100 loop passes:
use strict;
use warnings;
use feature qw( say );
use B qw( svref_2object );
sub SvLEN(\$) { svref_2object($_[0])->LEN }
my $big = "a" x 100;
my $str = "";
my $incs = 0;
for (1..100) {
my $len1 = SvLEN($str);
$str .= $big;
my $len2 = SvLEN($str);
my $len_inc = $len2 - $len1;
#say $len1, " ", $len_inc;
++$incs if $len_inc;
}
say $incs; # 21
Upvotes: 2
Reputation: 126722
I suggest that you avoid clever tricks. Perl's handling of string memory has improved vastly in ten years: it now pre-expands every string proportionally to its original size, and retains any memory allocated in case the program repeats the same behaviour.
You can squeeze another ten percent performance out of the algorithm by using lexical variables and avoiding the C-style for loop.
Also, Time::HiRes already provides tv_interval for calculating the difference between two calls to gettimeofday:
use strict;
use warnings 'all';
use Time::HiRes qw/ gettimeofday tv_interval /;
my $big = 'a' x 100;
my $start = [ gettimeofday ];
for my $i (1 .. 1_000_000 ) {
my $str;
for my $j ( 1 .. 100 ) {
$str .= $big;
}
}
my $end = [ gettimeofday ];
printf "took %.3f secs\n", tv_interval( $start, $end );
took 8.324 secs
Incidentally, the same program running on my Pixel C tablet running Android 7.1.2 on an ARM processor returned 21.683s. I think that's pretty good going.
Upvotes: 5