agsamek
agsamek

Reputation: 9064

how fast are string operations in Perl? In particular concatenation and assignment

How fast is string concatenation in Perl? Is it linear to the length of the second operand? If so, what conditions need to be met for this operation to be linear? What are the examples on non-linear concatenation time?

And what about string assignment? When and where does the actual copy of the buffer occurs?

What about other operations like substring or simple regexes?

Upvotes: 3

Views: 1387

Answers (2)

agsamek
agsamek

Reputation: 9064

I tested this myself. Concatenation is linear to the length of the second argument but assignment is always linear to the length of the string.

It looks like Perl does not count references for strings but associates a buffer with every variable (reference).

Here are some test results:

Concatenation seems to be constant and entire test is linear:

 248ms   my $x; $x .= "a" for 1..2_000_000
 501ms   my $x; $x .= "a" for 1..4_000_000
 967ms   my $x; $x .= "a" for 1..8_000_000

$x = $x . $y seems to be optimized and uses $x buffer in this case:

 295ms   my $x; $x = $x . "a" for 1..2_000_000
 592ms   my $x; $x = $x . "a" for 1..4_000_000
1170ms   my $x; $x = $x . "a" for 1..8_000_000

Previous optimization seems to be done statically so concatenation in next test is linear to the resulting string length and entire test is quadratic:

 233ms   my $x; ${\$x} = ${\$x} . "a" for 1..40_000
 951ms   my $x; ${\$x} = ${\$x} . "a" for 1..80_000
3811ms   my $x; ${\$x} = ${\$x} . "a" for 1..160_000

Copying is linear:

 186ms   my $x; for (1..50_000) { $x .= "a"; my $y = $x }
 764ms   my $x; for (1..100_000) { $x .= "a"; my $y = $x }
3029ms   my $x; for (1..200_000) { $x .= "a"; my $y = $x }

Every copy is linear, reference counting is not used for strings:

 545ms   my $x; for (1..50_000) { $x .= "a"; my $y = $x; my $y2 = $x; my $y3 = $x }
2264ms   my $x; for (1..100_000) { $x .= "a"; my $y = $x; my $y2 = $x; my $y3 = $x }
8951ms   my $x; for (1..200_000) { $x .= "a"; my $y = $x; my $y2 = $x; my $y3 = $x }

Upvotes: 2

bvr
bvr

Reputation: 9697

This is really complex question and answer depends on far many factors (architecture, underlying OS, HW, Perl compilation flags, etc.)

To get an idea, you can take a look at internals of perl structures used to represent your variables. Good source is perlguts illustrated.

If you have specific implementation in mind, try benchmarking your code:

use Benchmark qw(:all);

my $a = "Some string";
my @b = map { "Some string to append " x $_ } (1..10);

cmpthese(-1, {
    ( map {+ "concat_$_" => sub { my $c = $a . $b[$_] } } (1..10) )
});

The thing above compares operation my $c = $a . $b for various length of second argument. From result it can be seen that for this length ranges the operation runs roughly in linear time.

Upvotes: 5

Related Questions