Reputation: 425
I have the following data in a hash of hash of arrays. The numerical data in the array represents that last 4 quarters of financial information.
I'd like to be able to iterate over the array, and pull out the data by quarter, to get it ready to insert into a database. With my code below I can get all of the quarters, or one quarter only if I specifically call it. When I try and add another loop to iterate over the hash of hash of arrays to return only the array subset values, i get all the values and I don't know what I'm doing wrong. See code: Thanks for the help
my %comp_info = (
CompanyA => {
COGS => ["175.00", "155.00", "125.00", "125.00"],
Revenue => ["300.00", "200.00", "250.00", "225.00"],
},
)
# The following works, but I have to specifically push one array subset at a time,
# which makes passing values to db_insert subroutine ineffective.
# (Id have to have 4 or 5 calls depending on number of quarters of data in each record).
sub get_insert_arrays {
foreach $comp (keys %comp_info ) {
foreach $column ( keys %{$comp_info{$comp}} ) {
push (@insert_array, @{$comp_info{$sym}{$column}}[0] );
}
}
my $valuelist = join(", ", @insert_array);
&db_insert($valuelist);
undef @insert_array;
}
#try to loop through, fails, I get all of the data in @insert_array instead of only one quarter.
sub get_insert_arrays {
foreach $comp (keys %comp_info ) {
foreach $column ( keys %{$comp_info{$comp}} ) {
for my $i ( 0 .. $#{$comp_info{$comp}{$column}} ) {
push (@insert_array, @{$comp_info{$sym}{$column}}[$i] );
}
}
my $valuelist = join(", ", @insert_array);
&db_insert($valuelist);
undef @insert_array;
undef $valuelist;
}
}
Upvotes: 1
Views: 212
Reputation: 126722
Your recent addition - undef @insert_array; undef $valuelist;
is a misuse of undef
. Putting it before a variable like that forces a garbage collection cycle, which is something you don't want to do -- it is best to let Perl look after things itself.
Arrays should be emptied with @insert_array = ()
rather than using undef
, and for scalars you should $valuelist = undef
. But these variables are irrelevant outside the subroutine, so you should declare them inside, in which case there is no need to reinitialise them in the first place.
Please bear in mind what I said about calling prepare
on an SQL statement with placeholders. Your code should look something like this
my $insert = $dbh->prepare('INSERT INTO table VALUES (?, ?)');
and later
my @insert_array = (175.00, 300.00);
$insert->execute(@insert_array);
However I have written this, which I think does what you want, to create a $valuelist
string as your own code does. Since you don't need the hash keys, it is much tidier to iterate over the values instead. The db_insert
subroutine is a dummy that just prints the values of the parameter passed to it.
use strict;
use warnings;
use 5.010;
my %comp_info = (
CompanyA => {
COGS => ["175.00", "155.00", "125.00", "125.00"],
Revenue => ["300.00", "200.00", "250.00", "225.00"],
},
);
my @values = map values %$_, values %comp_info;
for my $i (0 .. $#{$values[0]}) {
my @insert = map $_->[$i], @values;
db_insert(join ', ', @insert);
}
sub db_insert {
say "db_insert('@_')";
}
output
db_insert('175.00, 300.00')
db_insert('155.00, 200.00')
db_insert('125.00, 250.00')
db_insert('125.00, 225.00')
Update
To comply with the new specification:
use strict;
use warnings;
use 5.010;
my %comp_info = (
CompanyA => {
COGS => ["175.00", "155.00", "125.00", "125.00"],
Revenue => ["300.00", "200.00", "250.00", "225.00"],
},
CompanyB => {
COGS => ["175.00", "155.00", "125.00", "125.00"],
Revenue => ["300.00", "200.00", "250.00", "225.00"],
},
CompanyC => {
COGS => ["175.00", "155.00", "125.00", "125.00"],
Revenue => ["300.00", "200.00", "250.00", "225.00"],
},
);
my @columns = qw/ COGS Revenue /;
for my $comp (keys %comp_info) {
my $data = $comp_info{$comp};
for my $i (0 .. $#{(values %$data)[0]}) {
my @values = ( $comp, map $_->[$i], @{$data}{@columns} );
db_insert(join ', ', @values);
}
}
sub db_insert {
say "db_insert('@_')";
}
output
db_insert('CompanyC, 175.00, 300.00')
db_insert('CompanyC, 155.00, 200.00')
db_insert('CompanyC, 125.00, 250.00')
db_insert('CompanyC, 125.00, 225.00')
db_insert('CompanyA, 175.00, 300.00')
db_insert('CompanyA, 155.00, 200.00')
db_insert('CompanyA, 125.00, 250.00')
db_insert('CompanyA, 125.00, 225.00')
db_insert('CompanyB, 175.00, 300.00')
db_insert('CompanyB, 155.00, 200.00')
db_insert('CompanyB, 125.00, 250.00')
db_insert('CompanyB, 125.00, 225.00')
Upvotes: 1
Reputation: 107040
I highly recommend to dereference with intermediate variables, and also using the ->
syntax. Both of these help you figure out what's going on:
Here's your first subroutine using dereferencing:
sub get_insert_arrays {
my @insert_array;
foreach $comp (keys %comp_info ) { # References to a hash (Revenue, COGS)
%columns = %{ $comp_info{$comp} }; # Dereference
foreach $column ( keys %columns ) { # Reference to an Array (quarter data)
my @values = @{ $column } ; # Dereference
push (@insert_array, $column );
my $valuelist = join(", ", @insert_array);
&db_insert($valuelist);
}
}
}
Hmm... Looking at this, it's easy to see that I can simply do:
for my $comp (keys %comp_info ) { # References to a hash (Revenue, COGS)
%columns = %{ $comp_info{$comp} }; # Dereference
for my $column ( keys %columns ) { # Reference to an Array (quarter data)
my @values = @{ $column } ; # Dereference
db_insert(@values);
}
}
}
If you need to see a particular piece of data, use the ->
syntax to simplify your structure:
${$comp_info{$sym}{$column}}[$i]; # You had "@". Should be "$".
vs.
$comp_info{$sym}->{$column}->[$i];
Much easier to read.
Also use the warnings
and strict
pragmas in your program. It'll catch a lot of errors that may include undefined variables, and misspelled variable names.
If you're pulling out data quarter-by-quarter, you probably want the COGS to be with the Revenue column:
#! /usr/bin/env perl
#
use strict; # Lets you know when you misspell variable names
use warnings; # Warns of issues (using undefined variables
use feature qw(say);
#
# Just initializing the data you had
#
my %comp_info = ( CompanyA => {
Revenue => [
300.00, 200.00, 250.00, 225.00
],
COGS => [
175.00, 155.00, 125.00, 125.00
],
},
);
#
# Go through each company
#
for my $company ( keys %comp_info ) {
my @revenues = @{ $comp_info{$company}->{Revenue} }; # Dereference
my @cogss = @{ $comp_info{$company}->{COGS} }; # Dereferenec
say "Company: $company";
#
# I know the keys are "Revenue" and "COGS", so I don't need a loop.
# I'll just go right to my quarters data. Note that dereferencing
# makes my program a lot easier to write and maintain
#
for my $quarter ( (0..3) ) {
my $revenue = $revenues[$quarter];
my $cogs = $cogss[$quarter];
my $profit = $revenue - $cogs;
say " Quarter: " . ($quarter - 1)
. " Revenue = $revenue COGS = $cogs Profit = $profit";
}
}
How you do your database inserts is up to you. But, you can see how doing a bit of dereferencing and using ->
can clarify what you're looking at.
how do I pull out just the quarter by quarter data without having to specify revenue, cogs, etc. in some cases there could be 30+ fields, so I don't want to have to specify each field in the program. I just want to grab all Q1 fields, insert, grab all Q2 field, insert, etc.
So two loops:
%comp_info
hash.%comp_info
hash is a reference to another hash that is keyed by the data type (COGS, Revenue, etc.). Again, I simply loop through the keys of that inner hash (after dereferencing to make it easier to understand). %comp_info
hash, and a list of the keys in that inner hash, I can simply pull up the first quarter numbers for each company and each data type. Getting the quarter value is simple: $comp_info{$company}->{$type}->[$quarter]
. Note that there is three levels of data, and I have three sections in my variable with each section separated by ->
.
$comp_info{$company}
). %comp_info
hash points to a hash reference (->{type}
) which is keyed by the data types (COGS, Revenue, etc.).->[$quarter]
). See how that works and why I like that ->
syntax? It makes it very clear what I am working with.for my $quarter (0..3) {
.Here's what it looks like. This is a complete program, so you could cut this out, and try running it yourself and see if you can figure out what's going on.
use strict; # Lets you know when you misspell variable names
use warnings; # Warns of issues (using undefined variables
use feature qw(say);
my $quarter = 0; #First Quarter is 0. Last quarter is 3
my %comp_info = ( CompanyA => {
Revenue => [
300.00, 200.00, 250.00, 225.00
],
COGS => [
175.00, 155.00, 125.00, 125.00
],
},
);
for my $company ( keys %comp_info ) {
say "Company: $company";
%types = %{ $company_info{$company} };
for my $type ( keys %types ) { # COGS, Revenue, etc.
say " $type for quarter "
. ($quarter + 1) . ": "
. $comp_info{$company}->{$type}->[$quarter];
}
}
One more time for each quarter to insert the data into your database:
Using use strict
and declaring variables with my
means that variables are only valid for a limited scope. That my @type_data;
declares a array that holds my type values for inserting into your database. However, since it's declared inside that for my $quarter
loop, the array and its values disappears with each iteration of the loop. No need to have to remove the data or reinitialize the variable. It does it all by itself!
Look up how Lexically Scoped Variables work.
for my $quarter ( (0..3) ) {
my @type_values;
for my $company ( keys %comp_info ) {
my %types = %{ $comp_info{$company} };
for my $type ( keys %types ) { # COGS, Revenue, etc.
push @type_values, $comp_info->{$company}->{$type}->{quarter};
}
insert_data( @type_values ); # Database insert you wanted
}
Upvotes: 1