Comparision of 2 Files using Perl

Question

Currently i have a requirement for comparing 2 tables each table having around 700M + records.

We have come up with a idea to compare in the File based rather than taking it to the DB , based upon our DB performance. Also heard that Perl is much faster than any other coding method

We need to compare for a Mobile number the Usage plan in one File to the same mobile Number in another File, either if present and its usage is matching or not , we need to write in a new file only the Not matching records.

Example:

File 1

number, Usage type , Usage Plan , Usage Volume (KB)  ........
12344 , CP         , FB         , 100  ........
12323 , UP         , FB         , 200  ........
12322 , CP         , G+         , 300  ........

File 2

number, Usage type , Usage Plan , Usage Volume (KB)  ........
12344 , CP         , FB         , 100  ........
12323 , UP         , FB         , 210  ........

So in the above case my unmatch File should contain

Unmatch File

12323 , UP         , FB         , 210  ........
12322 , CP         , G+         , 300  ........

........ Means there are many columns after this, which we will not be using to compare. They are rather like more details about the plan.

Please share your suggestions and coding idea on this.

Our aim is to complete the comparison within 6 - 7 hours.. so that loading and other things can be completed within 2 days..

Thanks in Advance.. Sam

Aldo · Accepted Answer

How about this:

use strict;

open FILE1, 'file1.txt';
open FILE2, 'file2.txt';
open OUTPUT, '>output.txt';

my $regex = qr/^ *(\d+) , (.*) , (.*) , (\d+)/;

my $file1;
while(){
  if(/$regex/){
    $file1->{$1}->{type} = $2;
    $file1->{$1}->{plan} = $3;
    $file1->{$1}->{volume} = $4;
  }
}

my $file2;
while(){
  if(/$regex/){
    $file2->{$1}->{type} = $2;
    $file2->{$1}->{plan} = $3;
    $file2->{$1}->{volume} = $4;
  }
}

my $numbers;
$numbers->{$_} = 1 foreach keys %$file1;
$numbers->{$_} = 1 foreach keys %$file2;

my $output;
foreach(keys %$numbers){
  if(defined $file1->{$_} && defined $file2->{$_}){
    if($file1->{$_}->{type} ne $file2->{$_}->{type} || $file1->{$_}->{plan} ne $file2->{$_}->{plan} || $file1->{$_}->{volume} ne $file2->{$_}->{volume}){
      push @$output, [$_, $file2->{$_}->{type}, $file2->{$_}->{plan}, $file2->{$_}->{volume}];
    }
  }elsif(defined $file1->{$_}){
    push @$output, [$_, $file1->{$_}->{type}, $file1->{$_}->{plan}, $file1->{$_}->{volume}];
  }else{
    push @$output, [$_, $file2->{$_}->{type}, $file2->{$_}->{plan}, $file2->{$_}->{volume}];
  }
}

print OUTPUT join(' , ', @$_)."
" foreach @$output;

Comparision of 2 Files using Perl

Answers (2)

Related Questions