Reputation: 3965
I wanted to remove duplicate values from an array with this approach. The duplicate removal has to be executed inside a loop. Here is a minimal example that demonstrates the problem I encountered:
use strict;
for (0..1){
my %seen;
sub testUnique{
return !$seen{shift}++;
}
my $res = testUnique(1);
if($res){
print "unique\n";
}else{
print "non-unique\n";
}
}
I define the %seen hash inside the loop, so I would expect it to exist only for a single iteration of the loop.
The result of the above code, however, is:
unique
non-unique
With some debug prints, I found out that the value of %seen is preserved from one iteration to the next.
I tried a trivial example:
for (0..1){
my %seen;
$seen{1}++;
print "$seen{1}\n";
}
And this one worked as expected. It printed:
1
1
So I guess the problem is with the inner function testUnique.
Can somebody explain to me what is going on here?
Upvotes: 2
Views: 99
Reputation: 385819
Welcome to the world of closures.
use feature 'say'; # say() is not enabled by default
sub make_closure {
my $counter = 0;
return sub { return ++$counter };
}
my $counter1 = make_closure();
my $counter2 = make_closure();
say $counter1->(); # 1
say $counter1->(); # 2
say $counter1->(); # 3
say $counter2->(); # 1
say $counter2->(); # 2
say $counter1->(); # 4
sub { } captures lexical variables that are in scope, giving the sub access to them even when the scope in which they exist is gone.
You use this ability every day without knowing it.
my $foo = ...;
sub print_foo { print "$foo\n"; }
If subs didn't capture, the above wouldn't work in a module since the file's lexical scope is normally exited before any of the functions in the module are called.
Not only does print_foo need to capture $foo for the above to work, it must do so when it's compiled.
sub testUnique {
return !$seen{shift}++;
}
is basically the same thing as
BEGIN {
*testUnique = sub {
return !$seen{shift}++;
};
}
which means that sub { } is executed at compile time, which means it captures the %seen that existed at compile time, meaning before the loop has even started.
The first pass of the loop will use that same %seen, but a new %seen will be created for each subsequent pass to allow things like
my @outer;
for (...) {
my @inner = ...;
push @outer, \@inner;
}
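To make that fragment concrete, here is a minimal runnable sketch. The loop bounds and the values stored in @inner are made up for illustration. It shows that each pass gets its own @inner, so the references pushed onto @outer point to different arrays.

```perl
use strict;
use warnings;

my @outer;
for my $i (1 .. 3) {
    # A fresh @inner is created on every pass of the loop
    my @inner = ($i, $i * 10);
    # Store a reference to this pass's array
    push @outer, \@inner;
}

# Each stored reference still holds the values from its own pass
print join(" ", @$_), "\n" for @outer;
```

If every pass reused the same @inner, all three entries of @outer would refer to one array holding only the last pass's values.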
If you executed the sub { } at run-time, there'd be no problem.
for (0..1){
my %seen;
local *testUnique = sub {
return !$seen{shift}++;
};
my $res = testUnique(1);
if($res){
print "unique\n";
}else{
print "non-unique\n";
}
}
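Another way to sidestep the capture, if you would rather keep a named sub, is to pass the hash in explicitly instead of closing over it. This is only a sketch: test_unique and its argument order are hypothetical names chosen for illustration, and the sub is called twice per pass to show both outcomes.

```perl
use strict;
use warnings;

# Hypothetical helper: the hash to track values in is an explicit argument,
# so the sub does not capture any variable from the loop.
sub test_unique {
    my ($seen, $value) = @_;
    return !$seen->{$value}++;
}

my @results;
for my $pass (0 .. 1) {
    my %seen;    # fresh hash on every pass
    push @results, test_unique(\%seen, 1) ? "unique" : "non-unique";
    push @results, test_unique(\%seen, 1) ? "unique" : "non-unique";
}

print "$_\n" for @results;
```

Within each pass the first call reports "unique" and the second "non-unique", and the next pass starts over with an empty hash.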
Upvotes: 2
Reputation: 118128
Your testUnique sub closes over the first instance of %seen. Even though it is inside the for loop, the subroutine does not get compiled repeatedly.
Your code is compiled once, including the part that says to initialize a lexically scoped variable %seen right at the top of the for loop.
The following will produce the output you want, but I am not sure I see why you are going down this path:
#!/usr/bin/env perl
use warnings;
use strict;
for (0..1){
my %seen;
my $tester = sub {
return !$seen{shift}++;
};
print $tester->(1) ? "unique\n" : "not unique\n";
}
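If the underlying goal is just to remove duplicates inside a loop, the helper sub may not be needed at all. A minimal sketch, assuming the data arrives as a list of array references (the @batches contents here are made up for illustration):

```perl
use strict;
use warnings;

# Made-up sample data: one array of values per loop pass
my @batches = ([1, 2, 2, 3], [3, 3, 1]);

my @deduped;
for my $batch (@batches) {
    my %seen;    # fresh hash on every pass
    # Keep a value only the first time it is seen in this batch
    my @unique = grep { !$seen{$_}++ } @$batch;
    push @deduped, \@unique;
}

print join(",", @$_), "\n" for @deduped;
```

Because the grep block is evaluated at run-time inside the loop body, it uses the %seen of the current pass.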
Upvotes: 4
Reputation: 4445
The subroutine can only be defined once, and isn't re-created for each iteration of the loop. As a result, it only holds a reference to the initial %seen hash. Adding some output helps to clarify this:
use strict;
use warnings;
for(0 .. 1) {
my %seen = ();
print "Just created " . \%seen . "\n";
sub testUnique {
print "Testing " . \%seen . "\n";
return ! $seen{shift} ++;
}
if(testUnique(1)) {
print "unique\n";
}
else {
print "non-unique\n";
}
}
Output:
Just created HASH(0x994fc18)
Testing HASH(0x994fc18)
unique
Just created HASH(0x993f048)
Testing HASH(0x994fc18)
non-unique
Here it can be seen that the initial hash is the only one tested.
Upvotes: 3