Daisy Sophia Hollman

Reputation: 6296

A couple of Perl subtleties

I've been programming in Perl for a while, but I have never understood a couple of its subtleties:

The use and the setting/unsetting of the $_ variable confuse me. For instance, why does

# ...
shift @queue;
($item1, @rest) = split /,/;

work, but (at least for me)

# ...
shift @queue;
/some_pattern.*/ or die();

does not seem to work?

Also, I don't understand the difference between iterating through a file using foreach versus while. For instance, I seem to be getting different results for

while(<SOME_FILE>){  
    # Do something involving $_        
}

and

foreach (<SOME_FILE>){
    # Do something involving $_
}

Can anyone explain these subtle differences?

Upvotes: 2

Views: 292

Answers (8)

Michael Krebs

Reputation: 8206

Another, albeit subtle, difference between:

while (<FILE>) {
}

and:

foreach (<FILE>) {
}

is that while() modifies the value of $_ outside of its scope, whereas foreach() makes $_ local to the loop. For example, the following will die, because $_ is undef once the while loop has read to the end of the file:

$_ = "test";
while (<FILE1>) {
    print "$_";
}
die if $_ ne "test";

whereas, this will not:

$_ = "test";
foreach (<FILE1>) {
    print "$_";
}
die if $_ ne "test";

This becomes more important with more complex scripts. Imagine something like:

sub func1() {
    while (<$fh2>) {  # clobbers $_ set from <$fh1> below
        <...>
    }
}

while (<$fh1>) {
    func1();
    <...>
}

Personally, I stay away from using $_ for this reason, in addition to it being less readable, etc.
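
The clobbering described above can be reproduced in a minimal sketch. It assumes in-memory filehandles in place of real files, and the names read_inner_clobbering, read_inner_safely and outer_lines_seen are hypothetical helpers invented for this illustration:

```perl
use strict;
use warnings;

# Hypothetical in-memory "files" so the sketch runs without real files.
my $outer_data = "a\nb\n";
my $inner_data = "x\ny\n";

sub read_inner_clobbering {
    open my $fh, '<', \$inner_data or die "open failed: $!";
    while (<$fh>) { }                 # assigns to the global $_
    close $fh;
}

sub read_inner_safely {
    open my $fh, '<', \$inner_data or die "open failed: $!";
    while (my $line = <$fh>) { }      # lexical variable, $_ untouched
    close $fh;
}

sub outer_lines_seen {
    my ($inner) = @_;
    my @seen;
    open my $fh, '<', \$outer_data or die "open failed: $!";
    while (<$fh>) {
        $inner->();
        push @seen, $_;               # undef if $inner overwrote $_
    }
    close $fh;
    return @seen;
}

my @clobbered = outer_lines_seen(\&read_inner_clobbering);
my @intact    = outer_lines_seen(\&read_inner_safely);

print defined($clobbered[0]) ? "kept\n" : "clobbered\n";   # prints "clobbered"
print defined($intact[0])    ? "kept\n" : "clobbered\n";   # prints "kept"
```

Reading into a lexical variable in the inner loop is one way to avoid the problem; writing local $_; at the top of the inner sub is another.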

Upvotes: 5

Alan Haggai Alavi

Reputation: 74272

Please read perldoc perlvar so that you will have an idea of the different variables in Perl.


Upvotes: 0

Sinan Ünür

Reputation: 118166

shift @queue;
($item1, @rest) = split /,/;

If I understand you correctly, you seem to think that this shifts off an element from @queue to $_. That is not true.

The value that is shifted off of @queue simply disappears. The following split operates on whatever is contained in $_ (which is independent of the shift invocation).
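
A minimal sketch of that behaviour, assuming some made-up sample data in @queue and a value already sitting in $_:

```perl
use strict;
use warnings;

my @queue = ('a,b,c', 'd,e,f');    # hypothetical sample data

$_ = 'x,y,z';
shift @queue;                      # return value discarded; $_ is untouched
my ($item1, @rest) = split /,/;    # splits $_, which is still 'x,y,z'
print "$item1 @rest\n";            # prints "x y z"

$_ = shift @queue;                 # explicit assignment: $_ is now 'd,e,f'
my ($first, @others) = split /,/;
print "$first @others\n";          # prints "d e f"
```

Both split /,/ and a bare /some_pattern.*/ match look at $_, so whether they appear to work depends only on what happened to be in $_ beforehand, not on the shift.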

while(<SOME_FILE>){  
    # Do something involving $_        
}

Reading from a filehandle in a while statement is special: It is equivalent to

while ( defined( $_ = readline *SOME_FILE ) ) {

This way, you can process even colossal files line-by-line.

On the other hand,

for(<SOME_FILE>){  
    # Do something involving $_        
}

will first load the entire file as a list of lines into memory. Try a 1GB file and see the difference.
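
The difference can be observed without a 1GB file. The following is only a sketch, assuming a small in-memory filehandle as a stand-in for SOME_FILE; it checks eof during the first iteration of each loop:

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";    # hypothetical stand-in for a file

# while: one line is read per iteration.
open my $fh, '<', \$data or die "open failed: $!";
my $eof_in_while;
while (my $line = <$fh>) {
    $eof_in_while = eof($fh) ? 1 : 0;   # checked during the first iteration
    last;
}
close $fh;

# for: readline runs in list context, so every line is read
# before the loop body starts.
open $fh, '<', \$data or die "open failed: $!";
my $eof_in_for;
for my $line (<$fh>) {
    $eof_in_for = eof($fh) ? 1 : 0;
    last;
}
close $fh;

print "while: at EOF during first iteration? $eof_in_while\n";   # 0
print "for:   at EOF during first iteration? $eof_in_for\n";     # 1
```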

Upvotes: 13

Nathan Fellman

Reputation: 127598

Regarding the 2nd question:

while (<FILE>) {
}

and

foreach (<FILE>) {
}

have the same functional behavior inside the loop body, including setting $_ to the current line. The difference is that while() evaluates <FILE> in scalar context, whereas foreach() evaluates <FILE> in list context. Consider the difference between:

$x = <FILE>;

and

@x = <FILE>;

In the first case, $x gets the next line of FILE (the first line, if nothing has been read yet), and in the second case @x gets all the remaining lines of the file. Each entry in @x is a different line in FILE.

So, if FILE is very big, you'll waste memory slurping it all at once using foreach (<FILE>) compared to while (<FILE>). This may or may not be an issue for you.

The place where it really matters is if FILE is a pipe descriptor, as in:

open(FILE, "some_shell_program|") or die "Cannot run some_shell_program: $!";

Now foreach(<FILE>) must wait for some_shell_program to complete before it can enter the loop, whereas while(<FILE>) can read the output of some_shell_program one line at a time and execute in parallel with some_shell_program.

That said, inside the loop body $_ holds the current line in both forms.
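
A minimal sketch of the two contexts, assuming an in-memory filehandle in place of FILE so that it runs without a real file:

```perl
use strict;
use warnings;

my $data = "line 1\nline 2\nline 3\n";   # hypothetical file contents

open my $fh, '<', \$data or die "open failed: $!";
my $x = <$fh>;    # scalar context: reads one line
my @x = <$fh>;    # list context: reads all remaining lines
close $fh;

print "scalar read: $x";                           # prints "scalar read: line 1"
print "list read got ", scalar(@x), " lines\n";    # prints "list read got 2 lines"
```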

Upvotes: 3

Brad Gilbert

Reputation: 34130

while only checks whether its condition is true; for also places each value in $_. There are exceptions: for example, <> sets $_ when it is the only thing in the condition of a while loop.

To get behaviour similar to:

foreach(qw'a b c'){
    # Do something involving $_
}

You have to set $_ explicitly:

my @list = qw'a b c';
while( defined( $_ = shift @list ) ){
    # Do something involving $_
}

It is better to explicitly set your variables:

for my $line(<SOME_FILE>){
}

or better yet

while( my $line = <SOME_FILE> ){
}

which will only read in the file one line at a time.


Also, shift doesn't set $_ unless you specifically ask it to:

$_ = shift @_;

And split works on $_ by default. In older versions of Perl, split in scalar or void context also populated @_ (a deprecated behaviour that has since been removed).

Upvotes: 0

cms

Reputation: 5992

It is to avoid this sort of confusion that it's considered better form to avoid using the implicit $_ constructions.

my $element = shift @queue;
my ($item, @rest) = split /,/, $element;

or

my ($item, @rest) = split /,/, shift @queue;

likewise

while (my $foo = <SOMEFILE>) {
    # do something with $foo
}

or

foreach my $thing (<FILEHANDLE>) {
    # do something with $thing
}

Upvotes: 0

woolstar

Reputation: 5083

foreach evaluates the entire list up front. while evaluates its condition on each pass to see whether it's still true. Use while for incremental operations and foreach only for list sources.

For example:

my $t = time() + 10;
while ( $t > time() ) {
    # do something
}
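
Another sketch of the same distinction, assuming a hypothetical work queue that grows while it is being processed. This suits while, because its condition is re-checked on every pass:

```perl
use strict;
use warnings;

# foreach: the list to walk is fixed before the first iteration.
my @visited;
foreach my $n (1 .. 3) {
    push @visited, $n;
}

# while: the condition is re-evaluated on every pass, so the queue
# can grow while it is being processed.
my @queue = (1);
my @processed;
while (defined(my $n = shift @queue)) {
    push @processed, $n;
    push @queue, $n + 1 if $n < 3;    # schedule more work on the fly
}

print "foreach visited: @visited\n";      # prints "foreach visited: 1 2 3"
print "while processed: @processed\n";    # prints "while processed: 1 2 3"
```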

Upvotes: 2
