user3360439

Reputation: 389

Perl: add the results of 2 arrays from 2 different files together

I run a report comparing 2 CSV files. The last thing I want to do is add the matching elements of the 2 arrays (built up of unique values and occurrences) together, but I can't work out how to say "for each name that appears in both arrays, add the values together" to get the output below.

INPUT:

jon  22          
james  12         
ken    22            
jack    33         
jim     11         
harry    7
dave     9
grant    12
matt     74
malc     12

INPUT1:

jon  2    
james  1         
ken    8           
jack    5         
jim     1        
harry    51
dave     22

Desired Output:

jon  24       
james  13     
ken    30           
jack    38  
jim     12     
harry    58
dave     31
grant    12
matt     74
malc     12 

The code I have so far to create the output from INPUT and INPUT1:

my %seen;
seek INPUT, 0, 0;
while (<INPUT>)
{
    chomp;
    my $line = $_;
    my @elements = split (",", $line);
    my $col_name = $elements[1];
    #print "    $col_name  \n" if ! 
    $seen{$col_name}++;
}

while ( my ( $col_name, $times_seen ) = each %seen ) {
    my $loc_total = $times_seen * $dd;
    print "\n";

    print "     $col_name \t\t :  = $loc_total";

    printf OUTPUT "%-34s = %15s\n", $col_name , " $loc_total ";

}
##############                          ###################

my %seen2;
seek INPUT1, 0, 0;
while (<INPUT1>)
{
    chomp;
    my $line = $_;
    my @elements1 = split (",", $line);
    my $col_name = $elements1[1];
    my $col_type = $elements1[5];

    $seen2{$col_name}++ if $col_type eq "YES";
}

while ( my ( $col_name, $times_seen2 ) = each %seen2 ) {
    my $loc_total = $times_seen2 ;

    print "\n    $col_name \t\t= $loc_total";
    printf OUTPUT "%-34s = %15s\n", $col_name , $times_seen2 ;
}

close INPUT;

Upvotes: 0

Views: 40

Answers (3)

Miller

Reputation: 35198

The following could easily be adapted to just take file names from the command line instead.

It maintains the order in which the keys first appear in your files:

use strict;
use warnings;
use autodie;

my @names;
my %total;

local @ARGV = qw(INPUT INPUT1);
while (<>) {
    my ($name, $val) = split;
    push @names, $name if ! exists $total{$name};
    $total{$name} += $val;
}

for (@names) {
    print "$_ $total{$_}\n";
}
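
As a rough sketch of the command-line adaptation mentioned above: simply dropping the `local @ARGV` line should be enough, since `<>` reads the files named on the command line. The variant below does the same with explicit opens so the logic sits in a reusable sub. The name `sum_by_name` is hypothetical, and the input is assumed to be whitespace-delimited "name value" lines as in the question's examples.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: sum the second column per name across all given
# files, remembering the order in which names first appear.
sub sum_by_name {
    my @files = @_;
    my (@names, %total);
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            my ($name, $val) = split ' ', $line;
            next unless defined $val;    # skip blank or malformed lines
            push @names, $name unless exists $total{$name};
            $total{$name} += $val;
        }
        close $fh;
    }
    return (\@names, \%total);
}

# Run only when file names are given on the command line.
if (@ARGV) {
    my ($names, $total) = sum_by_name(@ARGV);
    print "$_ $total->{$_}\n" for @$names;
}
```

Saved under an assumed file name such as sum.pl, it would be run as `perl sum.pl INPUT INPUT1`.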

Upvotes: 0

rje

Reputation: 274

First, I'll assume that the input files are actual CSV files -- whereas your examples are just whitespace delimited. In other words:

jon,22          
james,12         
ken,22            
jack,33         
jim,11         
harry,7
dave,9
grant,12
matt,74
malc,12

and

jon,2    
james,1         
ken,8           
jack,5         
jim,1        
harry,51
dave,22

Assuming I'm correct, your while loops will do the trick, with a couple of tweaks:

  1. The first element of your @elements array has index 0, not 1. So the "key" here is at $elements[0], and the "value" is at $elements[1]. So you'd have something like:

    my $col_name = $elements[0];

    my $col_value = $elements[1];

  2. Instead of incrementing %seen, it seems more useful to add the value, like so:

    $seen{ $col_name } += $col_value;

  3. In your while loop which iterates over INPUT1, do the same thing done in the first loop to extract data; also, don't use %seen2; instead, simply add to %seen as above:

    my $col_name = $elements1[0];

    my $col_value = $elements1[1];

    $seen{$col_name} += $col_value;

  4. Your totals will then be stored in %seen, so your final while loop is slightly modified:

while ( my ( $col_name, $times_seen2 ) = each %seen ) { # instead of %seen2

If your two processing loops are identical (and I see it's possible that they're not), then I'd suggest factoring them into a common subroutine. But that's a different matter.
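
Putting the four tweaks together with the common-subroutine idea, a minimal sketch might look like the following. It assumes both files really are "name,value" lines; `add_csv_totals` is a hypothetical name, and the in-memory strings only stand in for the INPUT and INPUT1 filehandles.

```perl
use strict;
use warnings;

# Hypothetical helper: read "name,value" lines from an open filehandle
# and add each value to the shared totals hash (passed by reference).
sub add_csv_totals {
    my ($fh, $totals) = @_;
    while (my $line = <$fh>) {
        chomp $line;
        my @elements = split /,/, $line;
        next unless @elements >= 2;              # skip blank or short lines
        my ($col_name, $col_value) = @elements[0, 1];
        s/^\s+|\s+$//g for $col_name, $col_value;    # trim stray whitespace
        $totals->{$col_name} += $col_value;
    }
    return;
}

# Demo: two in-memory "files" standing in for INPUT and INPUT1.
my %seen;
my @samples = ("jon,22\nken,22\ngrant,12\n", "jon,2\nken,8\n");
for my $data (@samples) {
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    add_csv_totals($fh, \%seen);
    close $fh;
}
print "$_ $seen{$_}\n" for sort keys %seen;
```

With real files, the same sub would be called once per filehandle, so both inputs feed a single %seen and names that appear in only one file keep their original value.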

Upvotes: 0

choroba

Reputation: 241868

Instead of using %seen, store the running total in the hash directly:

#!/usr/bin/perl
use warnings;
use strict;

my %count;
for my $file ('INPUT', 'INPUT1') {
    open my $IN, '<', $file or die "$file: $!";
    while (<$IN>) {
        my ($name, $num) = split;
        $count{$name} += $num;
    }
}

for my $name (sort { $count{$b} <=> $count{$a} } keys %count) {
    print "$name\t$count{$name}\n";
}

Upvotes: 1
