mob

Reputation: 118665

Can I copy a hash without resetting its "each" iterator?

I am using each to iterate through a Perl hash:

while (my ($key,$val) = each %hash) {
   ...
}

Then something interesting happens and I want to print out the hash. At first I considered something like:

while (my ($key,$val) = each %hash) {
   if (something_interesting_happens()) {
      foreach my $k (keys %hash) { print "$k => $hash{$k}\n" }
   }
}

But that won't work, because everyone knows that calling keys (or values) on a hash resets the internal iterator used for each, and we may get an infinite loop. For example, these scripts will run forever:

perl -e '%a=(foo=>1); while(each %a){keys %a}'
perl -e '%a=(foo=>1); while(each %a){values %a}'

No problem, I thought. I could make a copy of the hash, and print out the copy.

   if (something_interesting_happens()) {
      %hash2 = %hash;
      foreach my $k (keys %hash2) { print "$k => $hash2{$k}\n" }
   }

But that doesn't work, either. This also resets the each iterator. In fact, any use of %hash in a list context seems to reset its each iterator. So these run forever, too:

perl -e '%a=(foo=>1); while(each %a){%b = %a}'
perl -e '%a=(foo=>1); while(each %a){@b = %a}'
perl -e '%a=(foo=>1); while(each %a){print %a}'
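
For anyone who wants to reproduce this without hanging their terminal, here is a bounded version of the same experiment. It is only a sketch: the `$passes` counter and the cap of 5 are my own additions for illustration, not part of the one-liners above.

```perl
use strict;
use warnings;

my %a = (foo => 1);
my $passes = 0;

while (my ($k, $v) = each %a) {
    my %b = %a;               # list-context use of %a inside the each loop
    last if ++$passes >= 5;   # safety cap (hypothetical); without it the loop keeps going
}

# A one-key hash should need one pass; hitting the cap means each() kept restarting.
print "loop body ran $passes time(s) for a 1-key hash\n";
```

If the iterator were left alone, the body would run once and the loop would end by itself.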

Is this documented anywhere? It makes sense that perl might need to use the same internal iterator to push a hash's contents onto a return stack, but I can also imagine hash implementations that didn't need to do that.

More importantly, is there any way to do what I want, that is, to get at all the elements of a hash without resetting the each iterator?


This also suggests you can't safely inspect a hash in the debugger while iterating over it with each. Consider running the debugger on:

%a = (foo => 123, bar => 456);
while ( ($k,$v) = each %a ) {
    $DB::single = 1;
    $o .= "$k,$v;";
}
print $o;

Just by inspecting the hash where the debugger stops (say, typing p %a or x %a), you will change the output of the program.


Update: I uploaded Hash::SafeKeys as a general solution to this problem. Thanks @gpojd for pointing me in the right direction and @cjm for a suggestion that made the solution much simpler.

Upvotes: 15

Views: 556

Answers (4)

LeoNerd

Reputation: 8532

Not really. each is incredibly fragile. It stores its iteration state on the iterated hash itself, and other parts of perl reuse that state when they need it. It is far safer to forget that each exists and always iterate over your own list from the result of keys %hash instead. The iteration state over a list is stored as part of the for loop itself, so it is immune from corruption by other things.
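
A minimal sketch of what that looks like, assuming a small example hash (the `%hash` contents, `$visited` and `@snapshot` are just illustrative names):

```perl
use strict;
use warnings;

my %hash = (foo => 1, bar => 2, baz => 3);
my $visited = 0;

# keys builds the whole list up front; the for loop then walks that list,
# not the hash's internal iterator.
for my $key (keys %hash) {
    my $val = $hash{$key};
    $visited++;

    # Calling keys (or copying the hash) in here does not disturb the outer loop.
    my @snapshot = map { "$_ => $hash{$_}" } sort keys %hash;
}

print "visited $visited of ", scalar(keys %hash), " keys\n";
```

The trade-off is that the key list is built in memory before the loop starts, which may matter for very large hashes.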

Upvotes: 1

Zaid

Reputation: 37146

Let's not forget that the result of keys %hash is already available before you enter the while loop. You could simply save the keys into an array for later use:

my @keys = keys %hash;

while (my ($key,$val) = each %hash) {

    if (something_interesting_happens()) {

        print "$_ => $hash{$_}\n" for @keys;
    }
}

Downside:

  • It's less elegant (subjective)
  • It won't work if %hash is modified (but then why would one use each in the first place?)

Upside:

  • It uses less memory by avoiding hash-copying

Upvotes: 1

gpojd

Reputation: 23085

Have you tried Storable's dclone to copy it? It would probably be something like this:

use Storable qw(dclone);
my %hash_copy = %{ dclone( \%hash ) };
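
Here is a slightly fuller sketch of how that could slot into the each loop. The example data, the `@seen` array and the `$guard` cap are hypothetical additions so you can check on your own perl whether the outer iteration really survives the copy. Note that dclone makes a deep copy, so it costs more than a plain `%copy = %hash` on large or nested data.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my %hash = (foo => 123, bar => 456, baz => 789);
my @seen;        # keys handed out by the outer each loop
my $guard = 0;   # safety cap (hypothetical) in case the iterator does get reset

while (my ($key, $val) = each %hash) {
    last if ++$guard > 100;
    push @seen, $key;

    # Copy via dclone instead of "%hash_copy = %hash", then work on the copy only.
    my %hash_copy = %{ dclone(\%hash) };
    print "$_ => $hash_copy{$_}\n" for sort keys %hash_copy;
}

print "outer loop saw ", scalar(@seen), " of ", scalar(keys %hash), " keys\n";
```

If the copy leaves the iterator alone, the outer loop should see each key exactly once and `$guard` should never come near the cap.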

Upvotes: 9

Mark Reed

Reputation: 95335

How big is this hash? How long does it take to iterate through it, such that you care about the timing of the access?

Just set a flag and do the action after the end of the iteration:

my $print_it;
while (my ($key,$val) = each %hash) {
    $print_it = 1 if something_interesting_happens();
    ...
}

if ($print_it) {
    foreach my $k (keys %hash) { print "$k => $hash{$k}\n" }
}

That said, once the first loop has finished there's no reason not to use each in the printout code, too, unless you were planning on sorting by key or something.

Upvotes: 2
