mandel

Reputation: 2947

Is $_ more efficient than a named variable in Perl's foreach?

I am quite new to Perl and I would like to know which of the following loops is more efficient:

my @numbers = (1,3,5,7,9);
foreach my $current (@numbers){
    print "$current\n";
}

or

my @numbers = (1,3,5,7,9);
foreach (@numbers){
    print "$_\n";
}

I want to know whether using $_ is more efficient, for example because it is kept in a register since it is so commonly used. I have written some code and am trying to clean it up, and I've found that I use the first loop more often than the second one.

Upvotes: 8

Views: 1648

Answers (8)

drby

Reputation: 2640

Benchmark:

use Benchmark qw(cmpthese);

my $iterations = 500000;

# Each test is a string of Perl code that Benchmark evals and times.
cmpthese( $iterations,
  {
    'Loop 1' => 'my @numbers = (1,3,5,7,9);
    foreach my $current (@numbers)
    {
      print "$current\n";
    }', 

    'Loop 2' => 'my @numbers = (1,3,5,7,9);
    foreach (@numbers)
    {
      print "$_\n";
    }'
  }
);

Output:

         Rate     Loop 2 Loop 1
Loop 2  23375/s     --    -1%
Loop 1  23546/s     1%     --

I've run it a couple of times with varying results. I think it's safe to say that there isn't much of a difference.

Upvotes: 6

Brad Gilbert

Reputation: 34120

Running the two options through "perl -MO=Concise,-terse,-src test.pl" results in these two op trees:

for my $n (@num){ ... }

LISTOP (0x9c08ea0) leave [1] 
    OP (0x9bad5e8) enter 
# 5: my @num = 1..9;
    COP (0x9b89668) nextstate 
    BINOP (0x9b86210) aassign [4] 
        UNOP (0x9bacfa0) null [142] 
            OP (0x9b905e0) pushmark 
            UNOP (0x9bad5c8) rv2av 
                SVOP (0x9bacf80) const [5] AV (0x9bd81b0) 
        UNOP (0x9b895c0) null [142] 
            OP (0x9bd95f8) pushmark 
            OP (0x9b4b020) padav [1] 
# 6: for my $n (@num){
    COP (0x9bd12a0) nextstate 
    BINOP (0x9c08b48) leaveloop 
        LOOP (0x9b1e820) enteriter [6] 
            OP (0x9b1e808) null [3] 
            UNOP (0x9bd1188) null [142] 
                OP (0x9bb5ab0) pushmark 
                OP (0x9b8c278) padav [1] 
        UNOP (0x9bdc290) null 
            LOGOP (0x9bdc2b0) and 
                OP (0x9b1e458) iter 
                LISTOP (0x9b859b8) lineseq 
# 7:   say $n;
                    COP (0x9be4f18) nextstate 
                    LISTOP (0x9b277c0) say 
                        OP (0x9c0edd0) pushmark 
                        OP (0x9bda658) padsv [6] # <===
                    OP (0x9b8a2f8) unstack 

for(@num){ ... }

LISTOP (0x8cdbea0) leave [1] 
    OP (0x8c805e8) enter 
# 5: my @num = 1..9;
    COP (0x8c5c668) nextstate 
    BINOP (0x8c59210) aassign [4] 
        UNOP (0x8c7ffa0) null [142] 
            OP (0x8ccc1f0) pushmark 
            UNOP (0x8c805c8) rv2av 
                SVOP (0x8c7ff80) const [7] AV (0x8cab1b0) 
        UNOP (0x8c5c5c0) null [142] 
            OP (0x8cac5f8) pushmark 
            OP (0x8c5f278) padav [1] 
# 6: for (@num){
    COP (0x8cb7f18) nextstate 
    BINOP (0x8ce1de8) leaveloop 
        LOOP (0x8bf1820) enteriter 
            OP (0x8bf1458) null [3] 
            UNOP (0x8caf2b0) null [142] 
                OP (0x8bf1808) pushmark 
                OP (0x8c88ab0) padav [1] 
            PADOP (0x8ca4188) gv  GV (0x8bd7810) *_ # <===
        UNOP (0x8cdbb48) null 
            LOGOP (0x8caf290) and 
                OP (0x8ce1dd0) iter 
                LISTOP (0x8c62aa8) lineseq 
# 7:   say $_;
                    COP (0x8cade88) nextstate 
                    LISTOP (0x8bf12d0) say 
                        OP (0x8cad658) pushmark 
                        UNOP (0x8c589b8) null [15] # <===
                            PADOP (0x8bfa7c0) gvsv  GV (0x8bd7810) *_ # <===
                    OP (0x8bf9a10) unstack 

I've added "<===" to mark the differences between the two.

Note that there are actually more ops in the "for(@num){...}" version.

So, if anything, the "for(@num){...}" version is probably slower than the "for my $n (@num){...}" version.

Upvotes: 2

David Schmitt

Reputation: 59316

Using $_ is a Perl idiom that signals to the seasoned programmer that the "current context" is being used. Many functions also take $_ as their default argument, which makes code more concise.
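
As a minimal sketch of the "default argument" point (the word list and variable names are made up for illustration), built-ins such as the match operator, uc, and length operate on $_ when no argument is given:

```perl
use strict;
use warnings;

# Hypothetical sample data, only for illustration.
my @words = ('apple', 'banana', 'cherry', 'mango');

my @report;
foreach (@words) {
    next unless /an/;      # the match operator tests $_ by default
    my $upper  = uc;       # uc with no argument uppercases $_
    my $length = length;   # length with no argument measures $_
    push @report, "$upper:$length";
}

print "$_\n" foreach @report;
```

The same loop with a named variable would need that variable spelled out at each of those three calls, which is the conciseness trade-off in question.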

Some might also argue that "it was hard to write, it should be hard to read".

Upvotes: 1

Hynek -Pichi- Vychodil

Reputation: 26121

Even though "premature optimisation is the root of all evil", this is another option:

{
  local $\ = "\n";
  print foreach @numbers;
}

But some expectations can be wrong. The test is a little weird, because output can cause odd side effects and the order of the tests can matter.

#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(:all :hireswallclock);

use constant Numbers => 10000;

my @numbers = (1 .. Numbers);

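# Run a code block with STDOUT redirected to /dev/null, so that
# terminal output does not dominate the timing.
sub no_out (&) {
    local *STDOUT;
    open STDOUT, '>', '/dev/null' or die "Cannot open /dev/null: $!";
    my $result = shift()->();
    close STDOUT;
    return $result;
}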

my %tests = (
    loop1 => sub {
        foreach my $current (@numbers) {
            print "$current\n";
        }
    },
    loop2 => sub {
        foreach (@numbers) {
            print "$_\n";
        }
    },
    loop3 => sub {
        local $\ = "\n";
        print foreach @numbers;
    },
);

sub permutations {
    return [
        map {
            my $a = $_;
            my @f = grep {$a ne $_} @_;
            map { [$a, @$_] } @{ permutations( @f ) }
            } @_
        ]
        if @_;
    return [[]];
}

foreach my $p ( @{ permutations( keys %tests ) } ) {
    my $result = {
        map {
            $_ => no_out { sleep 1; countit( 2, $tests{$_} ) }
            } @$p
    };

    cmpthese($result);
}

One might expect loop2 to be faster than loop1:

       Rate loop2 loop1 loop3
loop2 322/s    --   -2%  -34%
loop1 328/s    2%    --  -33%
loop3 486/s   51%   48%    --
       Rate loop2 loop1 loop3
loop2 322/s    --   -0%  -34%
loop1 323/s    0%    --  -34%
loop3 486/s   51%   50%    --
       Rate loop2 loop1 loop3
loop2 323/s    --   -0%  -33%
loop1 324/s    0%    --  -33%
loop3 484/s   50%   49%    --
       Rate loop2 loop1 loop3
loop2 317/s    --   -3%  -35%
loop1 328/s    3%    --  -33%
loop3 488/s   54%   49%    --
       Rate loop2 loop1 loop3
loop2 323/s    --   -2%  -34%
loop1 329/s    2%    --  -33%
loop3 489/s   51%   49%    --
       Rate loop2 loop1 loop3
loop2 325/s    --   -1%  -33%
loop1 329/s    1%    --  -32%
loop3 488/s   50%   48%    --

Sometimes I consistently observed loop1 running about 15%-20% faster than loop2, but I can't determine why.

I looked at the generated bytecode for loop1 and loop2, and the only difference is the creation of the my variable. The contents of that variable are neither allocated nor copied, so this operation is very cheap. I think the difference comes only from the "$_\n" construct, which is not cheap. These loops should be very similar:

for (@numbers) {
  ...
}

for my $a (@numbers) {
  ...
}

but this loop is more expensive

for (@numbers) {
  my $a = $_;
  ...
}

and also

print "$a\n";

is more expensive than

print $a, "\n";
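
One way to see why the first two loops cost about the same while the third does extra work is that foreach aliases its loop variable to each element instead of copying it. A minimal sketch (the array names are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical data, only for illustration.
my @aliased   = (1, 3, 5);
my @via_named = (1, 3, 5);
my @copied    = (1, 3, 5);

# $_ is an alias: changing it changes the array element itself.
$_ *= 2 foreach @aliased;

# A named loop variable is an alias too; no copy is made.
foreach my $n (@via_named) {
    $n *= 2;
}

# An explicit copy creates a new scalar on every iteration,
# so the array is left untouched.
foreach (@copied) {
    my $copy = $_;
    $copy *= 2;
}

print "aliased:   @aliased\n";
print "via_named: @via_named\n";
print "copied:    @copied\n";
```

Only the last loop leaves its array unchanged, because each iteration allocates and fills a fresh scalar; that extra work is what makes it more expensive.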

Upvotes: 11

Joe Casadonte

Reputation: 16859

I'm more interested in the general idea of using $_ rather than printing...

As a side note, Perl Best Practices is a good place to go if you want to start learning which idioms to avoid and why. I don't agree with everything the author writes, but he's spot on most of the time.

Upvotes: 2

tunnuz

Reputation: 23978

I don't know, but first of all you save a variable assignment in the second version of the loop. I can imagine that, since $_ is used so often, it is optimized somehow. You could try profiling it; a very good Perl profiler is NYTProf 2, written by Tim Bunce.

Then again, is it really worth optimizing such small things? I don't think a loop will make a difference. I suggest using the profiler to measure your performance and identify the real bottlenecks. Usually the speed problems are located in the 10% of the code that runs 90% of the time (maybe it won't be exactly 10-90, but this is the "famous" ratio :P).

Upvotes: 0

tddmonkey

Reputation: 21184

Have you identified a performance problem in the sections of code that use these loops? If not, go for the one that is more readable and thus more maintainable. Any difference in speed will probably be negligible, especially compared to other parts of your system. Always code for maintainability first, then profile, then code for performance.

"Premature optimisation is the root of all evil"[1]

[1] Knuth, Donald. "Structured Programming with go to Statements", ACM Computing Surveys, Vol. 6, No. 4, Dec. 1974, p. 268.

Upvotes: 14

schnaader

Reputation: 49719

You could have a look at this tutorial; it also has a chapter, "Benchmark Your Code", that you could use to compare the two approaches.

Upvotes: 6
