Joe Casadonte

Reputation: 16859

Why should you NOT return an array ref?

In the question "Is returning a whole array from a Perl subroutine inefficient" two people recommend against optimizing if there is no need for it. As a general rule, optimizing can add complexity, and if it's not needed, simple is better. But in this specific case, returning an array versus an array ref, I don't see that there's any added complexity, and I think consistency in the interface design would be more important. Consequently, I almost always do something like:

sub foo
{
   my($result) = [];

   #....build up the result array ref

   $result;
}

Is there a reason I should not do this, even for small results?

Upvotes: 12

Views: 12959

Answers (12)

brian d foy

Reputation: 132832

You shouldn't return an array reference if it's inconsistent with the rest of your interface. If everything else that you work with returns lists instead of references, don't be the odd duck who causes other programmers to remember the exception.

Unless you have large lists, this is really a micro-optimization issue. You should be so lucky if this is the bottleneck in your program.

As far as complexity goes, the difference between a reference and a list is so far down on the complexity scale that you have bigger problems if your programmers are struggling with that. Complicated algorithms and workflows are complex, but this is just syntax.

Having said all of that, I tend to make everything return references and make interfaces consistent with that.

Upvotes: 23

Axeman

Reputation: 29854

I just want to comment on the clumsy syntax of handling an array reference as opposed to a list. As brian mentioned, you really shouldn't return references if the rest of the system is using lists. It's an unneeded optimization in most cases.

However, if that is not the case, and you are free to create your own style, then one thing that can make the coding less smelly is using autobox. autobox turns SCALAR, ARRAY and HASH (as well as others) into "packages", such that you can code:

my ( $name, $number ) = $obj->get_arrayref()->items( 0, 1 );

instead of the slightly more clumsy:

my ( $name, $number ) = @{ $obj->get_arrayref() };

by coding something like this:

sub ARRAY::slice {
    my $arr_ref = shift;
    my $length  = @$arr_ref;
    # Clamp out-of-range indexes to the first or last element.
    my @subs    = map { abs($_) < $length ? $_ : $_ < 0 ? 0 : $#$arr_ref } @_;
    return $arr_ref if @subs == 0;                                 # no indexes: the whole array
    return [ @{$arr_ref}[ $subs[0] .. $subs[1] ] ] if @subs == 2;  # two indexes: a range
    return [ @{$arr_ref}[ @subs ] ];                               # otherwise: the listed elements
}

sub ARRAY::items { return @{ &ARRAY::slice }; }

Keep in mind that autobox requires you to implement all the behaviors you want from these. $arr_ref->pop() doesn't work until you define sub ARRAY::pop, unless you use autobox::Core.

Upvotes: 2

Igor

Reputation: 4838

Since nobody mentioned wantarray, I will :-)

I consider it good practice to let the caller decide which context it wants the result in. For instance, in the code below, you ask perl for the context the subroutine was called in and decide what to return.

sub get_things {
    my @things;
    ... # populate things
    return wantarray ? @things : \@things;
}

Then

for my $thing ( get_things() ) {
    ...
}

and

my @things = get_things();

work properly because of the list context, and:

my $things = get_things();

will return the array's reference.

For more info about wantarray you might want to check perldoc -f wantarray.

Edit: I overlooked one of the first answers, which mentioned wantarray, but I think this answer is still valid because it makes it a bit clearer.

Upvotes: 4

Greg Bacon

Reputation: 139531

An important omission in the above answers: don't return references to private data!

For example:

package MyClass;

sub new {
  my($class) = @_;
  bless { _things => [] } => $class;
}

sub add_things {
  my $self = shift;
  push @{ $self->{_things} } => @_;
}

sub things {
  my($self) = @_;
  $self->{_things};  # NO!
}

Yes, users can peek directly under the hood with Perl objects implemented this way, but don't make it easy for users to unwittingly shoot themselves in the foot, e.g.,

my $obj = MyClass->new;
$obj->add_things(1 .. 3);

...;

my $things = $obj->things;
my $first = shift @$things;

Better would be to return a (perhaps deep) copy of your private data, as in

sub things {
  my($self) = @_;
  @{ $self->{_things} };
}
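
The difference can be sketched as follows. This is a minimal illustration, assuming two hypothetical classes, Leaky and Safe, that differ only in what things() returns; the count() method is an addition here, used just to observe the internal state.

```perl
use strict;
use warnings;

{
    package Leaky;    # hypothetical class: hands out its internal array ref
    sub new    { my ($class) = @_; return bless { _things => [ 1 .. 3 ] }, $class }
    sub things { my ($self) = @_; return $self->{_things} }
    sub count  { my ($self) = @_; return scalar @{ $self->{_things} } }
}

{
    package Safe;     # hypothetical class: hands out a shallow copy as a list
    sub new    { my ($class) = @_; return bless { _things => [ 1 .. 3 ] }, $class }
    sub things { my ($self) = @_; return @{ $self->{_things} } }
    sub count  { my ($self) = @_; return scalar @{ $self->{_things} } }
}

my $leaky  = Leaky->new;
my $things = $leaky->things;
shift @$things;                      # modifies the object's private array

my $safe = Safe->new;
my @copy = $safe->things;
shift @copy;                         # modifies only the caller's copy

printf "Leaky now holds %d things, Safe still holds %d\n",
    $leaky->count, $safe->count;
```

The Leaky object ends up with two elements and the Safe object keeps all three. Note that the copy is shallow: if the elements were themselves references, a deep copy would be needed to protect them.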

Upvotes: 1

Schwern

Reputation: 164919

I'll copy the relevant portion of my answer from the other question here.

The oft overlooked second consideration is the interface. How is the returned array going to be used? This is important because whole array dereferencing is kinda awful in Perl. For example:

for my $info (@{ getInfo($some, $args) }) {
    ...
}

That's ugly. This is much better.

for my $info ( getInfo($some, $args) ) {
    ...
}

It also lends itself to mapping and grepping.

my @info = grep { ... } getInfo($some, $args);

But returning an array ref can be handy if you're going to pick out individual elements:

my $address = getInfo($some, $args)->[2];

That's simpler than:

my $address = (getInfo($some, $args))[2];

Or:

my @info = getInfo($some, $args);
my $address = $info[2];

But at that point, you should question whether @info is truly a list or a hash.

my $address = getInfo($some, $args)->{address};

Unlike arrays vs array refs, there's little reason to choose to return a hash over a hash ref. Hash refs allow handy short-hand, like the code above. And opposite of arrays vs refs, it makes the iterator case simpler, or at least avoids a middle-man variable.

for my $key (keys %{ some_func_that_returns_a_hash_ref() }) {
    ...
}

What you should not do is have getInfo() return an array ref in scalar context and an array in list context. This muddles the traditional use of scalar context as array length which will surprise the user.
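
A small sketch of that surprise, assuming two hypothetical subs (plain_list and context_sensitive) that build the same three-element array:

```perl
use strict;
use warnings;

# Hypothetical subs, for illustration only.
sub plain_list {
    my @things = (10, 20, 30);
    return @things;                          # array: count in scalar context
}

sub context_sensitive {
    my @things = (10, 20, 30);
    return wantarray ? @things : \@things;   # array ref in scalar context
}

my $count    = plain_list();          # the traditional meaning: 3
my $surprise = context_sensitive();   # an ARRAY reference, not a count

print "plain_list in scalar context: $count\n";
print "context_sensitive in scalar context: ", ref($surprise), " reference\n";
```

A caller who writes `if (context_sensitive() > 2)` expecting a length is comparing against a reference's numeric address instead.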

I would like to add that while making everything consistently do X is a good rule of thumb, it is not of paramount importance in designing a good interface. Go a bit too far with it and you can easily steamroll other more important concerns.

Finally, I will plug my own module, Method::Signatures, because it offers a compromise for passing in array references without having to use the array ref syntax.

use Method::Signatures;

method foo(\@args) {
    print "@args";      # @args is not a copy
    push @args, 42;   # this alters the caller array
}

my @nums = (1,2,3);
Class->foo(\@nums);   # prints 1 2 3
print "@nums";        # prints 1 2 3 42

This is done through the magic of Data::Alias.

Upvotes: 8

vy32

Reputation: 29655

If the array is constructed inside the function there is no reason to return the array; just return a reference, since the caller is guaranteed that there will only be one copy of it (it was just created).

If the function is considering a set of global arrays and returning one of them, then it's acceptable to return a reference if the caller will not modify it. If the caller might modify the array, and this is not desired, then the function should return a copy.

This really is a uniquely Perl problem. In Java you always return a reference; marking an array final only fixes the reference, not its contents, so a function that wants to prevent modification has to return a copy or an unmodifiable collection. In Python references are returned and there is no way to prevent them from being modified; if that's important, a reference to a copy is returned instead.

Upvotes: 2

daotoad

Reputation: 27183

Returning an array gives some nice benefits:

my @foo = get_array(); # Get list and assign to array.
my $foo = get_array(); # Get the number of elements (array in scalar context).
my ($f1, $f2) = get_array(); # Get first two members of list.
my ($f3,$f6) = (get_array())[3,6]; # Get specific members of the list.

sub get_array {
   my @array = 0..9;

   return @array;
}

If you return array refs, you'll have to write several subs to do the same work. Also, an empty array returns false in a boolean context, but an empty array ref does not.

if ( get_array() ) {
    do_stuff();
}

If you return array refs, then you have to do:

if ( @{ get_array_ref() } ) {
    do_stuff();
}

Except that if get_array_ref() fails to return a ref, say it returns an undef value instead, you have a program-halting crash. One of the following will help:

if ( @{ get_array_ref() || [] } ) {
    do_stuff();
}

if ( eval{ @{get_array_ref()} } ) {
    do_stuff();
}
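
A minimal sketch of the failure and the guard, assuming a hypothetical get_array_ref() that always fails by returning undef:

```perl
use strict;
use warnings;

# Hypothetical sub that fails and returns undef instead of an array ref.
sub get_array_ref { return undef }

# Unguarded dereference: under strict refs this dies, so the eval yields undef.
my $survived = eval { my $n = @{ get_array_ref() }; 1 };
print "unguarded dereference died: $@" unless $survived;

# Guarded dereference: undef is replaced by an empty anonymous array.
my $state = @{ get_array_ref() || [] } ? 'non-empty' : 'empty';
print "guarded result is $state\n";
```

The `|| []` form treats a failed call the same as an empty result, which may or may not be what the caller wants; the eval form also swallows any other exception thrown inside the call.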

So if the speed benefits are needed or if you need an array ref (perhaps you want to allow direct manipulation of an object's collection element--yuck, but sometimes it must happen), return an array ref. Otherwise, I find the benefits of standard arrays worth preserving.

Update: It is really important to remember that what you return from a routine is not always an array or a list. What you return is whatever follows the return, or the result of the last operation. Your return value will be evaluated in context. Most of the time, everything will be fine, but sometimes you can get unexpected behavior.

sub foo {
    return $_[0]..$_[1];   # in scalar context, .. is the flip-flop operator, not a range
}

my $a = foo(9,20);
my @a = foo(9,20);

print "$a\n";
print "@a\n";

Compare with:

sub foo {
    my @foo = ($_[0]..$_[1]);
    return @foo;
}

my $a = foo(9,20);
my @a = foo(9,20);

print "$a\n";
print "@a\n";

So, when you say "return an array" be sure you really mean "return an array". Be aware of what you return from your routines.

Upvotes: 0

Frank

Reputation: 66194

Is there a reason I should not do this, even for small results?

There's not a Perl-specific reason, meaning it's correct and efficient to return a reference to the local array. The only downside is that people who call your function have to deal with the returned array ref, accessing elements with the arrow -> or dereferencing it. So, it's slightly more troublesome for the caller.

Upvotes: 0

Brad Gilbert

Reputation: 34120

I don't think you should feel constrained to only using one or two methods. You should however keep it consistent for each module, or set of modules.

Here are some examples to ponder on:

sub test1{
  my @arr;
  return @arr;
}
sub test2{
  my @arr;
  return @arr if wantarray;
  return \@arr;
}
sub test3{
  my %hash;
  return %hash;
}
sub test4{
  my %hash;
  return %hash if wantarray;
  return \%hash;
}
sub test5{
  my %hash;
  return @hash{ qw'one two three' } if wantarray;  # hash slice
  return \%hash;
}
{
  package test;
  use Devel::Caller qw'called_as_method';
  sub test6{
    my $out;
    if( wantarray ){
      $out = 'list';
    }else{
      $out = 'scalar';
    }
    $out = "call in $out context";
    if( called_as_method ){
      $out = "method $out";
    }else{
      $out = "simple function $out";
    }
    return $out;
  }
}

I can see possibly using many of these in future projects, but some of them are rather pointless.

Upvotes: 1

Tom Alsberg

Reputation: 7033

I am not sure if returning a reference is more efficient in this case; i.e. does Perl copy data returned by subroutines?

In general, if your array is constructed entirely within the subroutine, there is no obvious problem with returning a reference, because otherwise the array would be discarded anyway. However, if the reference is also stored or passed elsewhere before it is returned, two parts of the program hold a reference to the same array, and a modification made through one will show up unexpectedly through the other.

Upvotes: 0

Mathieu Longtin

Reputation: 16710

No. Except do "return $result;" for clarity.

I remember testing the efficiency of those, and the difference in performance was minimal for small arrays. For large arrays, returning a reference was way faster.
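
A comparison like that can be reproduced with the core Benchmark module. This is only a sketch: the subs by_list and by_ref and the array size are assumptions for illustration, and the numbers will vary with the Perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @data = (1 .. 100_000);   # assumed size; adjust to match your real data

# Hypothetical subs: both build the same array, then return it differently.
sub by_list { my @result = @data; return @result }    # copies the elements out
sub by_ref  { my @result = @data; return \@result }   # returns a single scalar

# Run each variant for about one CPU second and print a comparison table.
cmpthese( -1, {
    list => sub { my @got = by_list(); return scalar @got },
    ref  => sub { my $got = by_ref();  return scalar @$got },
} );
```

Shrinking @data to a handful of elements is a quick way to check the claim about small arrays as well.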

It's really a convenience thing for small results. Would you rather do this:

($foo,$bar) = barbaz();

Or returning a reference:

 $foobar = barbaz();
 $foobar->[0]; # $foo
 $foobar->[1]; # $bar

Another way to return a reference:

($foo,$bar) = @{barbaz()};

As a rule, once you decide which way to go, just keep to it for your module, since it is confusing to switch from one method to the next.

I typically return array references for lists of similar things, and an array when the response is composed of two to four different elements. More than that, I make a hash, since not all callers will care about all the response elements.

Upvotes: 8

Hynek -Pichi- Vychodil

Reputation: 26121

If you are used to writing code like the first snippet in Mathieu Longtin's answer, returning a reference forces you to write ugly code like his second snippet, or this not much better code:

my ($foo,$bar) = @{barbaz()};

I think this is the biggest drawback of returning a reference instead of an array. If I want to return a small number of values of different kinds, I return an array and assign directly to variables (as you would in Python, for example).

my ($status, $result) = do_something();
if ($status eq 'OK') {
    ...

If there are more values, of various kinds, I return a hash ref (better for refactoring):

my ($status, $data, $foo, $bar, $baz) =
    @{do_something()}{qw(status data foo bar baz)};
if ($status eq 'OK') {
    ...

If the return values are all of the same kind, then whether to return an array or an array ref is debatable and depends on how many there are.

Upvotes: 0
