user1316667

Reputation: 23

How can I organize this data using Perl?

I am new to Perl. I have an input file such as:

a 7 5
b 8 2    
a 3 2   
b 4 1    
c 6 1

How can I get output in the following form?

column_1_val, number_occurrence_column_1, sum_of_column_2, sum_of_column_3

For example:

a 2 10 7
b 2 12 3
c 1 6 1

Upvotes: 2

Views: 169

Answers (4)

Axeman

Reputation: 29854

A slightly different take:

use strict;
use warnings;

my %records;

# Accumulate an occurrence count and two column sums per key (column 1).
while ( <> ) {
    my @cols = split ' ';
    my $rec  = $records{ $cols[0] } ||= {};
    $rec->{number_occurrence_column_1}++;
    $rec->{sum_of_column_2} += $cols[1];
    $rec->{sum_of_column_3} += $cols[2];
}

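# Build one flat record per key, in sorted key order. The leading '+' marks
# the inner braces as an anonymous hash rather than a block.
foreach my $rec ( map { +{ col1 => $_, %{ $records{$_} } } }
                  sort keys %records
                ) {
    print join( "\t",
                @$rec{ qw<col1 number_occurrence_column_1
                          sum_of_column_2 sum_of_column_3> }
              ), "\n";
}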

Upvotes: 0

rsp

Reputation: 23373

That would be something like (untested):

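use strict;
use warnings;

my (%nr, %r1, %r2);

while (<>) {
    # Capture the key and the two numeric columns.
    if (m/(\w+)\s+(\d+)\s+(\d+)/) {
        my ($n, $r1, $r2) = ($1, $2, $3);

        $nr{$n}++;          # occurrences of this key
        $r1{$n} += $r1;     # running sum of column 2
        $r2{$n} += $r2;     # running sum of column 3
    }
}

for my $n (sort keys %nr) {
    print "$n $nr{$n} $r1{$n} $r2{$n}\n";
}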

This is a very quick-and-dirty way of doing what you described, but it should get you on your way.

Upvotes: 1

Borodin

Reputation: 126722

The program below is a possible solution. I have used the DATA file handle whereas you will presumably need to open an external file and use the handle from that.

use strict;
use warnings;

use feature 'say';

my %data;

while (<DATA>) {
  my ($key, @vals) = split;
  $data{$key}[0]++;
  my $i;
  $data{$key}[++$i] += $_ for @vals;
}

say join ' ', $_, @{$data{$_}} for sort keys %data;

__DATA__
a 7 5
b 8 2    
a 3 2   
b 4 1    
c 6 1

output

a 2 10 7
b 2 12 3
c 1 6 1
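
For reading from an external file instead of DATA, the same logic could look roughly like the sketch below. The sub name summarize_file and the way the file name is taken from the command line are assumptions made for illustration, not part of the program above.

```perl
use strict;
use warnings;

# Hypothetical helper: read whitespace-separated records from $path and
# return one summary line per key: key, occurrence count, column sums.
sub summarize_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";

    my %data;
    while ( my $line = <$fh> ) {
        my ($key, @vals) = split ' ', $line;
        next unless defined $key;      # skip blank lines
        $data{$key}[0]++;              # slot 0 holds the occurrence count
        $data{$key}[$_ + 1] += $vals[$_] for 0 .. $#vals;   # slots 1..n hold the sums
    }
    close $fh;

    return map { join ' ', $_, @{ $data{$_} } } sort keys %data;
}

# Run only when a file name is given, e.g.: perl summarize.pl input.txt
if (@ARGV) {
    print "$_\n" for summarize_file( $ARGV[0] );
}
```

Opening the file with a lexical handle and the three-argument form of open keeps the handle scoped to the sub and reports a clear error if the file is missing.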

Upvotes: 2

Vijay

Reputation: 67221

I am not familiar with Perl either, but if you only care about the result, here is a solution in awk. It may or may not help you, but in case you need it:

awk '{c[$1]++; a[$1]+=$2; b[$1]+=$3} END {for (i in a) print i, c[i], a[i], b[i]}' file3

Upvotes: 0
