Reputation: 9064
How fast is string concatenation in Perl? Is it linear to the length of the second operand? If so, what conditions need to be met for this operation to be linear? What are the examples on non-linear concatenation time?
And what about string assignment? When and where does the actual copy of the buffer occurs?
What about other operations like substring or simple regexes?
Upvotes: 3
Views: 1387
Reputation: 9064
I tested this myself. Concatenation is linear to the length of the second argument but assignment is always linear to the length of the string.
It looks like Perl does not count references for strings but associates a buffer with every variable (reference).
Here are some test results:
Concatenation seems to be constant and entire test is linear:
248ms my $x; $x .= "a" for 1..2_000_000
501ms my $x; $x .= "a" for 1..4_000_000
967ms my $x; $x .= "a" for 1..8_000_000
$x = $x . $y
seems to be optimized and uses $x
buffer in this case:
295ms my $x; $x = $x . "a" for 1..2_000_000
592ms my $x; $x = $x . "a" for 1..4_000_000
1170ms my $x; $x = $x . "a" for 1..8_000_000
Previous optimization seems to be done statically so concatenation in next test is linear to the resulting string length and entire test is quadratic:
233ms my $x; ${\$x} = ${\$x} . "a" for 1..40_000
951ms my $x; ${\$x} = ${\$x} . "a" for 1..80_000
3811ms my $x; ${\$x} = ${\$x} . "a" for 1..160_000
Copying is linear:
186ms my $x; for (1..50_000) { $x .= "a"; my $y = $x }
764ms my $x; for (1..100_000) { $x .= "a"; my $y = $x }
3029ms my $x; for (1..200_000) { $x .= "a"; my $y = $x }
Every copy is linear, reference counting is not used for strings:
545ms my $x; for (1..50_000) { $x .= "a"; my $y = $x; my $y2 = $x; my $y3 = $x }
2264ms my $x; for (1..100_000) { $x .= "a"; my $y = $x; my $y2 = $x; my $y3 = $x }
8951ms my $x; for (1..200_000) { $x .= "a"; my $y = $x; my $y2 = $x; my $y3 = $x }
Upvotes: 2
Reputation: 9697
This is really complex question and answer depends on far many factors (architecture, underlying OS, HW, Perl compilation flags, etc.)
To get an idea, you can take a look at internals of perl structures used to represent your variables. Good source is perlguts illustrated.
If you have specific implementation in mind, try benchmarking your code:
use Benchmark qw(:all);
my $a = "Some string";
my @b = map { "Some string to append " x $_ } (1..10);
cmpthese(-1, {
( map {+ "concat_$_" => sub { my $c = $a . $b[$_] } } (1..10) )
});
The thing above compares operation my $c = $a . $b
for various length of second argument. From result it can be seen that for this length ranges the operation runs roughly in linear time.
Upvotes: 5