vkk05

Reputation: 3222

Perl extract columns from two files based on condition

I have two files, file1.txt and file2.txt.

file1.txt

RAC1 GK1 111
RAC2 GK2 222
RAC1 GK3 333
RAC1 GK4 222
RAC2 GK5 111

file2.txt

R1,PAAE,222,TESTA,COLA,NO
R2,RWWG,111,TESTB,COLM,YES
R3,TDAS,444,TESTC,COLZ,NO

I am comparing the two files and trying to extract data from them. The condition is: if the column 3 value of a row in file1 matches the column 3 value of a row in file2, print the combined row. The expected output is:

RAC1,GK1,111,R2,RWWG,TESTB,COLM,YES
RAC2,GK5,111,R2,RWWG,TESTB,COLM,YES
RAC2,GK2,222,R1,PAAE,TESTA,COLA,NO
RAC1,GK4,222,R1,PAAE,TESTA,COLA,NO

I have written a script for this that uses the column 2 value from file1 as the hash key. But that value doesn't exist in file2, so the comparison never matches.

I also can't use column 3 from file1 as the key, because it contains duplicate values and later rows would overwrite earlier ones.

My code so far:

my %hash1 = ();

open(FH1, "file1.txt");

while(<FH1>){
    chomp($_);
    my @val = split(' ', $_);

    $hash1{$val[1]}{'RAC_VAL'}   = $val[0];
    $hash1{$val[1]}{'ID'} = $val[2];
}

#print Dumper(\%hash1);


open(FH2, "file2.txt");
while(<FH2>){
    chomp($_);
    my @array = split(',', $_);
    print "$hash1{$array[2]}{'RAC_VAL'},,$hash1{$array[2]}{'ID'},$array[0],$array[1],$array[3],$array[4],$array[5]\n" if(exists $hash1{$array[2]}{'ID'});   
}

How can I get the output above from these data files based on that condition?

Upvotes: 0

Views: 72

Answers (1)

Håkon Hægland

Reputation: 40718

Here is an example that uses an array of arrays as the values in %hash1, keyed by column 3 of file1 (since those keys are not unique, each key holds a list of rows):

use feature qw(say);
use strict;
use warnings;

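my %hash1;

# Index file1 by column 3. A key can occur more than once, so each
# value is an array of [column 1, column 2] pairs.
open(my $fh1, '<', "file1.txt") or die "Cannot open file1.txt: $!";
while (my $line = <$fh1>) {
    chomp $line;
    my @val = split ' ', $line;
    push @{ $hash1{$val[2]} }, [ @val[0,1] ];
}
close $fh1;

# For each file2 row, print one line per file1 row with the same column 3.
open(my $fh2, '<', "file2.txt") or die "Cannot open file2.txt: $!";
while (my $line = <$fh2>) {
    chomp $line;
    my @array = split /,/, $line;
    if ( exists $hash1{$array[2]} ) {
        for my $item ( @{ $hash1{$array[2]} } ) {
            say join ',', @$item, @array[0,1,3,4,5];
        }
    }
}
close $fh2;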

Output:

RAC2,GK2,R1,PAAE,TESTA,COLA,NO
RAC1,GK4,R1,PAAE,TESTA,COLA,NO
RAC1,GK1,R2,RWWG,TESTB,COLM,YES
RAC2,GK5,R2,RWWG,TESTB,COLM,YES
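
Note that the output above leaves out the matched column 3 value (111 or 222), which the expected output in the question does include. A minimal sketch of one way to add it, assuming the same hash-of-arrays approach: the helper name join_on_third_column is hypothetical, and the file contents are passed in as strings here only to keep the example self-contained.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical helper (not part of the original answer): takes the contents
# of both files as strings and returns the joined output lines.
sub join_on_third_column {
    my ($file1_text, $file2_text) = @_;

    # Group file1 rows by column 3; duplicate keys are kept in input order.
    my %rows_by_id;
    for my $line (split /\n/, $file1_text) {
        my @val = split ' ', $line;
        next unless @val >= 3;    # skip blank or short lines
        push @{ $rows_by_id{$val[2]} }, [ @val[0,1] ];
    }

    my @out;
    for my $line (split /\n/, $file2_text) {
        my @array = split /,/, $line;
        next unless @array >= 6;  # skip lines without all six fields
        next unless exists $rows_by_id{$array[2]};
        for my $item ( @{ $rows_by_id{$array[2]} } ) {
            # $array[2] is the matched value, placed third as in the question.
            push @out, join ',', @$item, $array[2], @array[0,1,3,4,5];
        }
    }
    return @out;
}

my $file1 = <<'END';
RAC1 GK1 111
RAC2 GK2 222
RAC1 GK3 333
RAC1 GK4 222
RAC2 GK5 111
END

my $file2 = <<'END';
R1,PAAE,222,TESTA,COLA,NO
R2,RWWG,111,TESTB,COLM,YES
R3,TDAS,444,TESTC,COLZ,NO
END

say for join_on_third_column($file1, $file2);
```

With the sample data this prints the same four rows as the question, each with the ID in the third position. The rows follow the order of file2 (the 222 matches first, then the 111 matches), so the line order differs from the listing in the question; if that order matters, the lines would need to be sorted before printing.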

Upvotes: 1
