user1527107
user1527107

Reputation: 49

comparing the contents of 2 lines of a file and adding them using perl

I have a file called listofvalues.txt. The file has over 1000 lines and 5 columns.

1,232,3434,54343,434343  
1,232,100,4546,3456  
1,122,45454,4546,3456  
2,212,334,5555,4654  
...  
...  

I want to add up the values of the third column if column 1 and 2 are equal and print the result into a file like as follows

1,232,3534,54343,434343  
1,122,45454,4546,3456  
2,212,334,5555,4654  
....  
.........  
.........  
......  

how do you think I can do it in Perl? Since I am new to Perl I am finding it hard to do.

Upvotes: 0

Views: 213

Answers (4)

Chris Charley
Chris Charley

Reputation: 6613

You could try the database approach, although it doesn't deal with col4 or col5.

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect("DBI:CSV:");
$dbh->{'csv_tables'}->{'data'} = { 'file' => 'o33.txt',
    'col_names' => [qw/col1 col2 col3 col4 col5/]};

my $sql = <<SQL;
select col1, col2, SUM(col3)
from data
group by col1, col2
order by col1, col2
SQL

my $sth = $dbh->prepare( $sql );
$sth->execute;

{
    local $" = ',';
    while ( my $row = $sth->fetchrow_arrayref ) {
        print "@$row\n";
    }
}

__END__
C:\Old_Data\perlp>type o33.txt
1,232,3434,54343,434343
1,232,100,4546,3456
1,122,45454,4546,3456
2,212,334,5555,4654

C:\Old_Data\perlp>perl t3.pl
1,122,45454
1,232,3534
2,212,334

Upvotes: 0

Thor
Thor

Reputation: 47169

Here is another one-linerish:

  perl -F, -lane '
  BEGIN { $, = "," }

  if(defined(@A)) {
    if($A[0] == $F[0] and $A[1] == $F[1]) {
      $A[3] +=  $F[3];
    } else {
      print @A;
      @A = (@F);
    }
  } else {
    @A = (@F);
  }

  END { print @A }' listofvalues.txt

See perlrun(1) for the implications of switches.

Upvotes: 1

Borodin
Borodin

Reputation: 126742

This program works by maintaining an array @data containing a list of all records with unique column1|column2 keys. The first time a new key is encountered in the file the complete record is pushed onto the stack. Each subsequent encounter just adds the third field of the record to the original value.

The hash %data maintains references to the element of @data corresponding to each different value of the key.

use strict;
use warnings;

open my $fh, '<', 'listofvalues.txt' or die $!;

my @data;
my %data;

while (<$fh>) {
  chomp;
  my @record = split /,/;
  my $key = join '|', @record[0,1];
  if ($data{$key}) {
    $data{$key}[2] += $record[2];
  }
  else {
    push @data, ($data{$key} = \@record);
  }
}

print join(',', @$_), "\n" for @data;

output

1,232,3534,54343,434343  
1,122,45454,4546,3456  
2,212,334,5555,4654

Update

A one-line solution

perl -F, -ane '$k="@F[0,1]";$s{$k}?$s{$k}[2]+=$F[2]:do{push@d,$k;$s{$k}=[@F]};END{$\"=',';print"@{$s{$_}}"for@d}' listofvalues.txt

Upvotes: 3

Zaid
Zaid

Reputation: 37146

Just because you can do it in a one-liner doesn't mean you should ;)

$ perl -F',' -lane '
push @order, [ @F[0,1] ]
  unless $seen{$F[0]}{$F[1]}++; # Preserve order
$total{$F[0]}{$F[1]} += $F[2];  # Sum up
$value{$F[0]}{$F[1]} = join ',' => @F[0,1], $total{$F[0]}{$F[1]}, @F[3..$#F];
} END {
    print $value{$_->{0]}{$_->[1]} for @order;
' file.txt

Upvotes: 0

Related Questions