user9259206
user9259206

Reputation:

Combination of two columns from text file and finding count with reference to third column in Perl

I am trying to read the in.txt file and generate an output file out.txt Using Perl. I tried with Hashes but not getting exact output.

Is there a way to do this in Perl.

Combination of two columns and providing comments on the basis of third column.

in.txt

Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y

out.txt

Account,Template,Active,NotActive
123,123456,0,2
456,321478,2,0
456,123456,1,0

Upvotes: 1

Views: 144

Answers (3)

Andrey
Andrey

Reputation: 1818

This works as well:

use strict;
use warnings;

my %data;
open my $fh, "<", "in.txt" or die $!;
while (my $line = <$fh>) {
    chomp $line;
    next if($line =~ /Account/);
    my @line = split ',', $line;
    $data{$line[1]}{$line[0]}{'Y'} = 0 if(!defined $data{$line[1]}{$line[0]}{'Y'});
    $data{$line[1]}{$line[0]}{'N'} = 0 if(!defined $data{$line[1]}{$line[0]}{'N'});
    $data{$line[1]}{$line[0]}{$line[2]} ++;
}
close $fh;
open my $FH, ">", "out.txt" or die $!;
    print $FH "Account,Template,Active,NotActive\n";
    foreach my $key (sort keys %data) {
        foreach my $key2 (sort keys %{$data{$key}}) {
            print $FH "$key,$key2,$data{$key}{$key2}{'Y'},$data{$key}{$key2}{'N'}\n";
        }
    }
close $FH;

You can also replace these two lines

$data{$line[1]}{$line[0]}{'Y'} = 0 if(!defined $data{$line[1]}{$line[0]}{'Y'});
$data{$line[1]}{$line[0]}{'N'} = 0 if(!defined $data{$line[1]}{$line[0]}{'N'});

with

$data{$line[1]}{$line[0]}{$_} //= 0 foreach ('Y', 'N');

Upvotes: 3

rav
rav

Reputation: 34

my $filename = 'input.txt';
my %yhash;
my %nhash;
if (open(my $ifh, '<:encoding(UTF-8)', $filename)) {
    while (my $row = <$ifh>) {
    next if ($row =~ /^#/m);
    chomp $row;
    my @values = split(',',$row);
    my $value = join '',@values ;
    my $lastchar = substr $value , -1;
    my $firstval = substr $value ,0,9;
    if ($lastchar eq "N"){
              if (exists($nhash{firstval})){ $nhash{firstval}++; }
              $nhash{$firstval}++;
    }elsif($lastchar eq "Y"){
              if (exists($yhash{firstval})){ $yhash{firstval}++; }
              $yhash{$firstval}++;
    }else{
             print "nothin\n";

    }
    }
    close $ifh;
    } else {
    warn "Could not open file '$filename' $!";
    }


   open(FH, '>', 'out.txt') or die $!;
   print FH "Account,Template,Active,NotActive\n";
   while (my ($key, $value) = each(%nhash)) {
    my $account = substr $key ,6,3;
    my $template = substr $key ,0,6;
    my $active = "0";
    my $notactive = "$value";
    print FH "$account,$template,$active,$notactive \n";
   }
  while (my ($key, $value) = each(%yhash)) {
    my $account = substr $key ,6,3;
    my $template = substr $key ,0,6;
    my $active = "$value";
    my $notactive = "0";
    print FH "$account,$template,$active,$notactive \n";
  }
  close (FH);

Upvotes: 0

Allan
Allan

Reputation: 12456

This is not a perl solution, but this works fine with awk:

AWK 1-liner:

awk 'BEGIN{FS=OFS=",";print "Account,Template,Active,NotActive"}NR>1{if($3=="Y"){a[$2 FS $1]++}else{b[$2 FS $1]++}}END{for(i in a){print i OFS a[i] OFS b[i]+0}for(u in b){if(b[u] && !a[u]){print u OFS a[u]+0 OFS b[u]}}}' input_file | sort -n

AWK script:

# BEGIN rule(s)

BEGIN {
        FS = OFS = "," #defines input/output field separator as ,
        print "Account,Template,Active,NotActive" #print the header
}

# Rule(s)

NR > 1 { # from the 2nd line of the file
        if ($3 == "Y") { # if the 3rd field is at Y
                a[$2 FS $1]++ #increment the array  indexed by $2 FS $1 
        } else {
                b[$2 FS $1]++ #do the same when N with the other array
        }
}

# END rule(s)

END {
        for (i in a) { # loop on all values of the arrays and print the content
                print i OFS a[i] OFS (b[i] + 0)
        }
        for (u in b) {
                if (b[u] && ! a[u]) { # do the same with the nonactive array and avoid double printing
                        print u OFS (a[u] + 0) OFS b[u]
                }
        }
} #pipe the output to a numerical sort to perform the proper ordering of the output

DEMO:

Input:

$ cat input_file 
Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y
123457,125,N
123457,125,Y

output:

Account,Template,Active,NotActive
123,123456,0,2
125,123457,1,1
456,123456,1,0
456,321478,2,0

Upvotes: -1

Related Questions