user740521

Reputation: 1204

Explain Perl code to display a number of bytes in KB, MB, GB etc

The function below takes a number of bytes and formats it as "bytes", "KB", "MB", or "GB", but what I don't understand is this portion:

$_->[1], $_->[0]

Isn't what's being passed to map just an array of hashes? If so, how can there be a 0 and a 1 index?

sub fmt {
    my $bytes = shift;

    return (
        sort { length $a <=> length $b                         } 
        map  { sprintf '%.3g%s', $bytes/1024**$_->[1], $_->[0] } 
        [" bytes"=>0],[KB=>1],[MB=>2],[GB=>3]
    )[0];
}

Upvotes: 9

Views: 1252

Answers (5)

Borodin

Reputation: 126752

That is one awful piece of code. Someone's showing off

The list passed to map is this: a list of anonymous arrays

[ " bytes" => 0 ], [ KB => 1 ], [ MB => 2 ], [ GB => 3 ]

While the fat comma operator => is often seen in the context of a hash literal, that's not all it's good for. It's identical to an ordinary comma , except that a bareword left-hand operand will be implicitly quoted. Without it the list would be the same as

[ ' bytes', 0 ], [ 'KB', 1 ], [ 'MB', 2 ], [ 'GB', 3 ]
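To check this directly, here is a minimal sketch that builds the list both ways and compares the elements. The variable names `@pairs_fat` and `@pairs_plain` are mine for illustration, not from the question.

```perl
use strict;
use warnings;

# Two ways of writing the same list of array references.
my @pairs_fat   = ( [ " bytes" => 0 ], [ KB => 1 ], [ MB => 2 ], [ GB => 3 ] );
my @pairs_plain = ( [ ' bytes', 0 ], [ 'KB', 1 ], [ 'MB', 2 ], [ 'GB', 3 ] );

for my $i ( 0 .. $#pairs_fat ) {
    my ( $fat, $plain ) = ( $pairs_fat[$i], $pairs_plain[$i] );

    # Each element is a reference to a two-element array:
    # index 0 holds the suffix, index 1 holds the power of 1024.
    my $same = ( $fat->[0] eq $plain->[0] && $fat->[1] == $plain->[1] ) ? 'yes' : 'no';

    printf "%s: suffix '%s', power %d, same as plain comma: %s\n",
        ref($fat), $fat->[0], $fat->[1], $same;
}
```

Every line should report `ARRAY`, which is why `$_->[0]` and `$_->[1]` work inside the map block.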

Here's the same function with the result of the intermediate map statement expanded into a separate array @variations, which I dump using Data::Dump to show what it's doing

The list passed to map is a number of anonymous arrays, each one containing a suffix string and the power of 1024 that the suffix represents. The return statement simply sorts the formatted strings by length and picks the shortest of the representations

use strict;
use warnings 'all';
use feature 'say';

use Data::Dump;

say fmt(987 * 1024**2);

sub fmt {
        my $bytes = shift;

        my @variations = map { sprintf '%.3g%s', $bytes/1024 ** $_->[1], $_->[0] }
            [ " bytes" => 0 ],
            [ KB => 1 ],
            [ MB => 2 ],
            [ GB => 3 ];

        dd \@variations;

        return ( sort { length $a <=> length $b } @variations ) [0];
}

output

["1.03e+009 bytes", "1.01e+006KB", "987MB", "0.964GB"]
987MB

I normally use something similar to this. The antics with sprintf are to make sure that fractions of a byte are never displayed

sub fmt2 {
    my ($n) = @_;
    my @suffix = ( '', qw/ K M G T P E / );

    my $i = 0;
    until ( $n < 1024 or $i == $#suffix ) {
        $n /= 1024;
        ++$i;
    }

    sprintf $i ? '%.3g%sB' : '%.0f%sB', $n, $suffix[$i];
}
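A brief usage sketch for fmt2, with the function repeated so the snippet runs on its own. The sample inputs are my own choices, picked so the division is exact.

```perl
use strict;
use warnings;

# Same function as above, repeated so this snippet is self-contained.
sub fmt2 {
    my ($n) = @_;
    my @suffix = ( '', qw/ K M G T P E / );

    # Step up one suffix at a time until the value fits or suffixes run out.
    my $i = 0;
    until ( $n < 1024 or $i == $#suffix ) {
        $n /= 1024;
        ++$i;
    }

    # Plain bytes are printed as a whole number; larger units get 3 significant digits.
    sprintf $i ? '%.3g%sB' : '%.0f%sB', $n, $suffix[$i];
}

print fmt2(500), "\n";              # 500B
print fmt2(1536), "\n";             # 1.5KB
print fmt2( 987 * 1024**2 ), "\n";  # 987MB
```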

Upvotes: 7

Ben Grimm

Reputation: 4371

With a tiny bit of math, this can be done without any iteration or cleverly constructed arrays:

my @si_prefix = ('', qw( K M G T P E Z Y ));
sub fmt {
  my $bytes = shift or return '0B';
  my $pow = int log(abs $bytes)/log(1024);
  return sprintf('%3.3g%sB', $bytes / (1024 ** $pow), $si_prefix[$pow]);
}

We can easily determine the largest power of 1024 that does not exceed $bytes by using the logarithm change-of-base rule, log_1024($bytes) = log($bytes) / log(1024), and truncating the result with int
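As a rough illustration of that calculation on its own, the sketch below pulls the logarithm step into a helper. The name `power_of_1024` and the sample values are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: the integer part of log base 1024 of $bytes,
# i.e. how many times the value can be divided by 1024 and stay >= 1.
sub power_of_1024 {
    my ($bytes) = @_;
    return int( log( abs $bytes ) / log(1024) );
}

for my $bytes ( 1, 1500, 5 * 1024**2, 7.45 * 1024**3 ) {
    my $pow = power_of_1024($bytes);
    printf "%.0f bytes: power %d, scaled value %.3g\n",
        $bytes, $pow, $bytes / 1024**$pow;
}
```

The sample values deliberately avoid exact powers of 1024. Because the result comes from floating-point logarithms, I would test values sitting exactly on a boundary before relying on this in real code.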

Just for fun, I ran Benchmark::cmpthese using the code from the question, @Borodin's fmt2, and my version:

Benchmarking 1B
                 Rate    fmt_orig fmt_borodin         fmt
fmt_orig     245700/s          --        -76%        -84%
fmt_borodin 1030928/s        320%          --        -34%
fmt         1562500/s        536%         52%          --

Benchmarking 7.45GB
                 Rate    fmt_orig fmt_borodin         fmt
fmt_orig     224215/s          --        -66%        -84%
fmt_borodin  653595/s        192%          --        -54%
fmt         1428571/s        537%        119%          --

Benchmarking 55.5EB
                 Rate    fmt_orig fmt_borodin         fmt
fmt_orig     207469/s          --        -57%        -83%
fmt_borodin  487805/s        135%          --        -60%
fmt         1219512/s        488%        150%          --

Upvotes: 5

Sobrique

Reputation: 53498

No, those aren't hashes.

[" bytes"=>0],[KB=>1],[MB=>2],[GB=>3]

Isn't a hash at all. => is basically just a comma. So this is

[" bytes", 0],["KB",1],["MB",2],["GB",3]

Which is a bit of an odd way of doing it, because it's sort of constructing a hash.

So I'd write this more like:

sub fmt2 {
   my ($bytes) = @_;
   my @units = qw ( bytes KB MB GB TB );
   # divide by 1024 while the value is too large and a bigger unit remains
   while ( $bytes >= 1024 and @units > 1 ) {
      shift(@units); #drop one of the units prefixes until one 'fits'. 
      $bytes /= 1024;
   }
   return sprintf( '%.3g%s', $bytes, $units[0] );
}

Upvotes: 3

Paul L

Reputation: 938

No, it is not being passed an array of hashes. It is being passed a list of array references, where each referenced array contains exactly two elements.

The => operator in Perl is also known as the "fat comma" operator. It is identical to the , operator except that it automatically quotes the left hand argument. It is most commonly used with hashes and hash references, because keys are so often strings and it's useful to have a visual indication that the key and value are linked in some way, but it is not a requirement.

[" bytes"=>0],[KB=>1],[MB=>2],[GB=>3]

is exactly the same as

[" bytes", 0],['KB', 1],['MB', 2],['GB', 3]

Simply using the => instead of , operator does not make a two-element array into a hash. If you wanted to pass map a list of hash references, you'd have to change all the [ ] to { }

{" bytes"=>0},{KB=>1},{MB=>2},{GB=>3}
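A small sketch of the difference, using two throwaway variables of my own (`$pair` and `$hash`), shows why the bracket type matters and what happens if a hash reference is indexed like an array:

```perl
use strict;
use warnings;

my $pair = [ KB => 1 ];    # array reference: two elements, at indices 0 and 1
my $hash = { KB => 1 };    # hash reference: one key 'KB' with value 1

print ref($pair), "\n";             # ARRAY
print ref($hash), "\n";             # HASH
print "$pair->[0] $pair->[1]\n";    # KB 1
print "$hash->{KB}\n";              # 1

# Treating the hash reference as an array reference dies at run time,
# so the attempt is wrapped in eval to report the error instead.
my $ok = eval { my $x = $hash->[0]; 1 };
print $ok ? "indexed fine\n" : "error: $@";
```

So with { } in place of [ ], the map block in the question would also have to change, since $_->[0] and $_->[1] would no longer be valid.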

More info on the fat comma operator

More info on Perl references

Upvotes: 1

Tudor Constantin

Reputation: 26861

map { sprintf '%.3g%s', $bytes/1024**$_->[1], $_->[0] } [" bytes"=>0],[KB=>1],[MB=>2],[GB=>3]

The above part returns a list of strings, one per unit, each formatted with that sprintf().

$_->[1] is the second element of each pair, the power of 1024 (0, 1, 2 or 3), while $_->[0] is the first element, the suffix string (" bytes", KB, MB or GB).

The whole function sorts those strings by length in ascending order and returns the first one, so it returns the shortest string among all the ones formatted with that sprintf().
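A quick way to check which end of the sorted list the [0] subscript selects, using the four strings from the sample output in Borodin's answer as input (the variable names are mine):

```perl
use strict;
use warnings;

# Candidate strings copied from the sample output in Borodin's answer.
my @variations = ( "1.03e+009 bytes", "1.01e+006KB", "987MB", "0.964GB" );

# An ascending sort by length puts the shortest string first.
my @by_length = sort { length($a) <=> length($b) } @variations;

print "$by_length[0]\n";     # 987MB (shortest, 5 characters)
print "$by_length[-1]\n";    # 1.03e+009 bytes (longest, 15 characters)
```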

Upvotes: 0
