Reputation: 4068
I'm searching for a way to reduce the following piece of code to a single regexp statement:
if( $current_value =~ /(\d+)(MB)*/ ){
    $current_value = $1 * 1024 * 1024;
}
elsif( $current_value =~ /(\d+)(GB)*/ ){
    $current_value = $1 * 1024 * 1024 * 1024;
}
elsif( $current_value =~ /(\d+)(KB)*/ ){
    $current_value = $1 * 1024;
}
The code converts a value to bytes. The value can be written as a plain number (bytes), or as a number followed by KB (kilobytes), MB (megabytes), and so on. How can I reduce this block of code?
Upvotes: 3
Views: 327
Reputation: 62236
use warnings;
use strict;
use Number::Format qw(format_bytes);
print format_bytes(1024), "\n";
print format_bytes(2535116549), "\n";
Output:
1K
2.36G
Upvotes: 5
Reputation: 34130
There is a problem with using KB for 1024 bytes. Kilo as a prefix generally means 1000 of a thing, not 1024.
The problem gets even worse with MB, since it has meant 1000*1000, 1024*1024, and 1000*1024.
A 1.44 MB floppy actually holds 1.44 * 1000 * 1024 bytes.
The only real way out of this is to use the newer KiB (kibibyte) to mean 1024 bytes.
The way you implemented it also has the limitation that you can't use 8.4Gi to mean 8.4 * 1024 * 1024 * 1024. To remove that limitation I used $RE{num}{real} from Regexp::Common instead of \d+.
Some of the other answers hardwire the match by writing out all of the possible matches. That can get very tedious, not to mention error-prone. To get around that I used the keys of %multiplier to generate the regex. This means that if you add or remove elements from %multiplier, you won't have to modify the regex by hand.
use strict;
use warnings;
use Regexp::Common;
my %multiplier;
my $multiplier_match;
{
# populate %multiplier
my %exponent = (
K => 1, # Kilo Kibi
M => 2, # Mega Mebi
G => 3, # Giga Gibi
T => 4, # Tera Tebi
P => 5, # Peta Pebi
E => 6, # Exa Exbi
Z => 7, # Zetta Zebi
Y => 8, # Yotta Yobi
);
while( my ($str,$exp) = each %exponent ){
@multiplier{ $str, "${str}B" } = (1000 ** $exp) x2; # K KB
@multiplier{ "${str}i", "${str}iB" } = (1024 ** $exp) x2; # Ki KiB
}
# %multiplier now holds 32 pairs (8*4)
# build $multiplier_match
local $" #" # fix broken highlighting
= '|';
my @keys = keys %multiplier;
$multiplier_match = qr(@keys);
}
sub remove_multiplier{
    die unless @_ == 1;
    local ($_) = @_;
    # s/^($RE{num}{real})($multiplier_match)$/ $1 * $multiplier{$2} /e;
    if( /^($RE{num}{real})($multiplier_match)$/ ){
        return $1 * $multiplier{$2};
    }
    return $_;
}
If you absolutely need 1K to mean 1024 then you only need to change one line.
# @multiplier{ $str, "${str}B" } = (1000 ** $exp) x2; # K KB
@multiplier{ $str, "${str}B" } = (1024 ** $exp) x2; # K KB
Note that since I used $RE{num}{real} from Regexp::Common, it will also work with 5.3e1Ki.
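For readers who don't have Regexp::Common installed, here is a minimal core-Perl sketch of the same idea of generating the alternation from the hash keys. The simplified number pattern \d+(?:\.\d+)?, the reduced set of prefixes, and the name size_to_bytes are assumptions made for illustration and are not part of the code above. The keys are passed through quotemeta and sorted longest first, so that KiB is tried before K.

```perl
use strict;
use warnings;

# Build the multiplier table (only four prefixes, for brevity).
my %multiplier;
my %exponent = ( K => 1, M => 2, G => 3, T => 4 );
for my $prefix ( keys %exponent ) {
    my $exp = $exponent{$prefix};
    @multiplier{ $prefix, "${prefix}B" }       = ( 1000**$exp ) x 2;    # K  KB
    @multiplier{ "${prefix}i", "${prefix}iB" } = ( 1024**$exp ) x 2;    # Ki KiB
}

# Generate the alternation from the keys, longest suffix first.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } keys %multiplier;
my $unit_re = qr/(?:$alternation)/;

# Hypothetical helper: convert "2KiB" or "1.5K" to a byte count.
sub size_to_bytes {
    my ($text) = @_;
    if ( $text =~ /^(\d+(?:\.\d+)?)($unit_re)$/ ) {
        return $1 * $multiplier{$2};
    }
    return $text;    # no recognised suffix: return the input unchanged
}

print size_to_bytes('2KiB'), "\n";
print size_to_bytes('1.5K'), "\n";
print size_to_bytes('42'),   "\n";
```

Unlike $RE{num}{real}, the simplified number pattern in this sketch does not accept signs or exponents such as 5.3e1Ki.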
Upvotes: 0
Reputation: 84348
You could set up a hash like this:
my %FACTORS = ( 'KB' => 1024, 'MB' => 1024**2, 'GB' => 1024**3 );
And then parse the text like this:
if ( $current_value =~ /(\d+)(KB|MB|GB)/ ) {
    $current_value = $1 * $FACTORS{$2};
}
In your example the regex has a *, which I'm not sure you intend, because * means "zero or more" and so (\d+)(MB)* would match 10 or 10MB or 10MBMB or 10MBMBMBMBMBMBMB.
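To see why the * matters, here is a small sketch. The sample strings and the to_bytes helper name are made up for illustration. Because (MB)* may match zero times, the first pattern from the question succeeds on any string that contains a digit, so the elsif branches are never reached. Making the unit optional with ? and anchoring the pattern is one way to avoid that:

```perl
use strict;
use warnings;

# The question's first test: (MB)* may match zero times,
# so this succeeds for every input below, including 10GB and 10KB.
for my $input ( '10', '10MB', '10GB', '10KB' ) {
    my $matched = $input =~ /(\d+)(MB)*/ ? 'matches' : 'no match';
    print "$input: $matched\n";
}

my %FACTORS = ( KB => 1024, MB => 1024**2, GB => 1024**3 );

# Hypothetical helper: anchored pattern with an optional unit.
sub to_bytes {
    my ($value) = @_;
    if ( $value =~ /^(\d+)(KB|MB|GB)?$/ ) {
        return defined($2) ? $1 * $FACTORS{$2} : $1;
    }
    return;    # not a recognised size
}

print to_bytes('10GB'), "\n";
```

Returning nothing for unrecognised input is a design choice of this sketch; the snippet above instead leaves such values unchanged.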
Upvotes: 4
Reputation: 2857
You can do it in one regexp, by putting code snippets inside the regexp to handle the three cases differently:
my $r;
$current_value =~ s/
(\d+)(?:
Ki (?{ $r = $^N * 1024 })
| Mi (?{ $r = $^N * 1024 * 1024 })
| Gi (?{ $r = $^N * 1024 * 1024 * 1024 })
)/$r/xso;
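If embedded code blocks feel heavy, a similar single-statement effect can be sketched with the /e modifier and a lookup hash. This is an alternative to the code above, not a rewrite of it; the %factor hash and the sample value are assumptions made for illustration.

```perl
use strict;
use warnings;

my %factor = ( Ki => 1024, Mi => 1024**2, Gi => 1024**3 );

my $current_value = '5Mi';

# /e evaluates the replacement as Perl code, so the captured number
# is multiplied by the factor looked up for the captured suffix.
$current_value =~ s/^(\d+)(Ki|Mi|Gi)$/$1 * $factor{$2}/e;

print "$current_value\n";
```

A value with no recognised suffix does not match and is left as it is.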
Upvotes: 1
Reputation: 39773
Using benzado's modified code, here is a test you can run to see if it works.
We advise you to always put code like this in a reusable method, and write a small unit-test for it:
use Test::More;
plan tests => 4;
##
# Convert a string denoting '50MB' into an amount in bytes.
my %FACTORS = ( 'KB' => 1024, 'MB' => 1024*1024, 'GB' => 1024*1024*1024 );
sub string_to_bytes {
my $current_value = shift;
if ( $current_value =~ /(\d+)(KB|MB|GB)/ ) {
$current_value = $1 * $FACTORS{$2};
}
return $current_value;
}
my $tests = {
'50' => 50,
'52KB' => 52*1024,
'55MB' => 55*1024*1024,
'57GB' => 57*1024*1024*1024
};
foreach(keys %$tests) {
is( string_to_bytes($_),$tests->{$_},
"Testing if $_ becomes $tests->{$_}");
}
Running this gives:
$ perl testz.pl
1..4
ok 1 - Testing if 55MB becomes 57671680
ok 2 - Testing if 50 becomes 50
ok 3 - Testing if 52KB becomes 53248
ok 4 - Testing if 57GB becomes 61203283968
Now you can
And voilà!
Upvotes: 1