Chris Simmons

Reputation: 259

Add Multiple values together when a condition is met?

My mind seems to be missing a few screws today. I have an issue that I'm baffled by, but to be fair, I'm new to Perl scripting.

I am opening a CSV file and need to look for duplicate values in one column. Where there are duplicates in that column, I need to add together the values from another column for each duplicate and print the total on a new line in a new file.

open(my $feed, '<', $rawFile) or die "Could not open '$rawFile': $!\n";
open(my $output, '>', $newFile) or die "Could not open '$newFile': $!\n";
while (my $line = <$feed>) {
    chomp $line;

    my @columns = split /,/, $line;
    my $Address = $columns[1];
    my $forSale = $columns[3];

}

I understand how to open the file and read it line by line, and I know how to print results to a new file. What I'm having trouble with is building the logic to say, "For each Address in this extract that has duplicates, add up all of their forSale values and print the Address in the new file with the summed forSale value." I hope this makes sense. Any assistance at all is appreciated.

Upvotes: 0

Views: 88

Answers (2)

Thanos

Reputation: 1778

Hello Chris Simmons,

I would like to add a few minor modifications to the excellent answer that Sobrique provided.

You can open a file the way you did, but you can also pass several files on the command line, e.g. test.pl sample1.csv sample2.csv, and read them all through the <> operator; see the Perl documentation for eof.

I would also check that each line contains a comma (,), and otherwise print a warning on the terminal that the line cannot be parsed.

Next, after splitting the line into an array, I would trim leading and trailing white space from each value.

Having said all that, see the solution below:

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my %hash;
while (<>) {
    chomp;
    if (index($_, ',') != -1) {
        my @fields = split(/,/);
        # remove leading and trailing white space
        s{^\s+|\s+$}{}g foreach @fields;
        $hash{$fields[0]} += $fields[3];
    }
    else {
        warn "Line could not be parsed: $_\n";
    }
} continue {
    close ARGV if eof;
}
print Dumper \%hash;

__END__

$ perl test.pl sample.csv
$VAR1 = {
          '123 6th St.' => 3,
          '71 Pilgrim Avenue' => 5
        };

__DATA__

123 6th St., Melbourne, FL 32904, 2
71 Pilgrim Avenue, Chevy Chase, MD 20815, 5
123 6th St., Melbourne, CT 06074, 1

Since you did not provide a sample of your input data, I created my own.

Another possible way is to use the module Text::CSV, as ikegami proposed. Sample code with the same checks that I mentioned earlier is below:

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;
use Data::Dumper;

my $csv = Text::CSV->new({ sep_char => ',' });

my %hash;
while (<>) {
    chomp;
    if ($csv->parse($_)) {
        my @fields = $csv->fields();
        # remove leading and trailing white space
        s{^\s+|\s+$}{}g foreach @fields;
        $hash{$fields[0]} += $fields[3];
    }
    else {
        warn "Line could not be parsed: $_\n";
    }
} continue {
    close ARGV if eof;
}
print Dumper \%hash;

__END__

$ perl test.pl sample.csv
$VAR1 = {
          '123 6th St.' => 3,
          '71 Pilgrim Avenue' => 5
        };

__DATA__

123 6th St., Melbourne, FL 32904, 2
71 Pilgrim Avenue, Chevy Chase, MD 20815, 5
123 6th St., Melbourne, CT 06074, 1

Hope this helps.

BR / Thanos

Upvotes: 1

Sobrique

Reputation: 53478

The tool you need for this job is a hash.

This will allow you to 'key' things by Address:

my %sum_of;

while (my $line = <$feed>) {
    chomp $line;

    my @columns = split /,/, $line;
    my $Address = $columns[1];
    my $forSale = $columns[3];

    # accumulate the forSale total for this Address
    $sum_of{$Address} += $forSale;

}

foreach my $address ( sort keys %sum_of ) {
    print "$address => $sum_of{$address}\n";
}
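
To tie this back to the question's goal of writing the totals to a new file, here is a minimal end-to-end sketch. It assumes the Address is in column 1 and forSale in column 3 (zero-based), as in the question's code. The sample rows and the helper name sum_by_column are made up for illustration, and in-memory filehandles stand in for the real $rawFile and $newFile so the example runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sum the value column for each distinct entry in the key column.
# $key_idx and $val_idx are zero-based column positions (assumptions;
# adjust them to match your file).
sub sum_by_column {
    my ($in_fh, $key_idx, $val_idx) = @_;
    my %sum_of;
    while (my $line = <$in_fh>) {
        chomp $line;
        next unless length $line;                    # skip blank lines
        my @columns = split /,/, $line;
        my ($key, $value) = @columns[$key_idx, $val_idx];
        next unless defined $key && defined $value;  # skip short rows
        s/^\s+|\s+$//g for $key, $value;             # trim white space
        $sum_of{$key} += $value;
    }
    return \%sum_of;
}

# Hypothetical sample data, read through an in-memory filehandle.
my $sample = <<'CSV';
1,10 Main St,x,2
2,22 Oak Ave,x,5
3,10 Main St,x,1
CSV

open(my $feed, '<', \$sample) or die "Could not open sample data: $!\n";
my $sum_of = sum_by_column($feed, 1, 3);
close $feed;

# Write one "address,total" line per distinct address. Replace \$report
# with a file name to write to disk instead.
my $report = '';
open(my $out, '>', \$report) or die "Could not open report: $!\n";
for my $address (sort keys %$sum_of) {
    print {$out} "$address,$sum_of->{$address}\n";
}
close $out;

print $report;
```

With a real file, the two open calls would take $rawFile and $newFile in place of the scalar references. Note that a plain split on commas breaks on quoted fields that contain commas; a CSV module such as Text::CSV handles those.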

Upvotes: 3
