With two arrays, I need to check and see if the elements of one appear in the other one, and print the matched elements separately

Question

F1.txt

bob
tom
harry

F2.txt

bob
a=1   b=2   c=3
bob
d=4   e=5   f=6
tom
a1=34  b1=32  c1=3443
tom
a2=534  b2=732  c2=673443

result:

A1.txt

bob
a=1   b=2   c=3
bob
d=4   e=5   f=6

A2.txt

tom
a1=34  b1=32  c1=3443
tom
a2=534  b2=732  c2=673443

I am new to Perl; can you kindly help me with my problem? Above I have shown two files, F1.txt and F2.txt. My job is to search for each element of F1.txt in F2.txt and print the matching line along with the line that follows it. When an element is found, the result must be saved in a new file. In the example, A1.txt stores all the information about bob and A2.txt stores all the information about tom. So far I have tried this code, but it is not working properly:

use strict; 
use warnings;  

my $line1; 
my $line2; 
my $fh; 
my $fh1; 
my $counter;  

open  $fh, "<", "F1.txt" or die $!; 
open  $fh1, "<", "F2.txt" or die $!;  

my @b = <$fh>; 
my @a = <$fh1>;  

for (@b) 
{     
  $line1 = $_;     

  for (@a)     
  {         
    $line2 = $_;         
    if ($line1 =~ /^$line2$/)         
    { 
      $counter++;             
      open my $outfile, ">>", "A_${counter}.txt";             
      print $outfile $line2;             
      close $outfile;         
    }
  }
}

David W. · Accepted Answer

Whenever you are checking for repeat elements, think hash. For example, let's say you have two files:

 File #1      File #2
 Bob          Tom
 Ted          Dick
 Alice        Harry
 Carol        Ted

If your job is to find the names in File #2 that are also in File #1, you could store the names from File #1 in a hash and then, as you go through File #2, check whether each name is in your hash.

First, let's read in File #1:

 use strict;
 use warnings;
 use autodie;  #This way, I don't have to check open statements

 open my $file_1, "<", "file_1";
 my %first_file_name_hash;
 while (my $name = <$file_1>) {
    chomp $name;
    $first_file_name_hash{$name} = 1;
 }
 close $file_1;

Now, %first_file_name_hash contains all of the names in File #1.

Now let's open File #2, and go through that:

open my $file_2, "<", "file_2";
while (my $name = <$file_2>) {
   chomp $name;    #Strip the newline so the name matches the hash key
   if ($first_file_name_hash{$name}) {
       print "User is in file #1 and file #2\n";
   }
}
close $file_2;

Yes, this isn't quite what you wanted, but it gives you a good idea of how to store data in a hash and look it up again.

A hash has a key that's associated with a value. Each entry must have a unique key. However, different entries in a hash can have the same value. Here's a simple hash:

 $hash{BOB} = "New York";
 $hash{CAROL} = "New York";
 $hash{TED} = "Los Angeles";
 $hash{ALICE} = "Chicago";

In the above, both $hash{BOB} and $hash{CAROL} have the same value (New York). However, there can only be a single BOB or CAROL in the hash.

The big advantage with a hash is that it's very easy to access an element by the key. You know the key, you can easily bring up the element.
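
As a minimal sketch of what that looks like in practice, the snippet below reuses the cities from the example above. The variable name %city_of and the name HARRY are only illustrative. It shows a direct lookup, a membership test with exists, and what happens when you assign to a key that is already there.

```perl
use strict;
use warnings;

# Illustrative sample data, taken from the example hash above.
my %city_of = (
    BOB   => "New York",
    CAROL => "New York",
    TED   => "Los Angeles",
    ALICE => "Chicago",
);

# Look up a value directly by its key.
print "TED lives in $city_of{TED}\n";

# exists tests whether a key is present in the hash.
for my $name (qw(ALICE HARRY)) {
    if (exists $city_of{$name}) {
        print "$name is in the hash\n";
    }
    else {
        print "$name is not in the hash\n";
    }
}

# Assigning to an existing key replaces its value; it does not add a second BOB.
$city_of{BOB} = "Boston";
print scalar(keys %city_of), " keys\n";
```

The last line still reports four keys, because the assignment overwrote the existing BOB entry rather than creating a new one.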

In your case, I would use two hashes. In the first, %HASH_1, I would save the names of everyone from the first file. In the second, %HASH_2, I would save the names of everyone in the second file, and I would make each value in %HASH_2 the line that follows the name in the file.

This will give you:

$HASH_1{bob} = 1;
$HASH_1{tom} = 1;
$HASH_1{harry} = 1; 

$HASH_2{bob} = a=1  b=2  c=3
               d=4  e=5  f=6
$HASH_2{tom} = a1=34  b1=32  c1=3443
               a2=534   b2=732   c2=673443

NOTE: Each key in a hash holds a single value, so when you have two or more lines with a key of bob, you have to figure out a way to handle them. In this case, if the key already exists in %HASH_2, I simply append the new line to the existing value, separated by a newline.

Perl also lets you store an array reference as a hash value, but you're a beginning Perl programmer, so we'll stick with the simpler tricks.
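
For the curious, here is a rough sketch of the array-in-a-hash alternative. It is not the approach used in the program below. The @records list is made-up sample data mirroring F2.txt, and %lines_for is just an illustrative name. Each hash value is a reference to an array, so extra lines for the same name are pushed onto that array instead of being joined with newlines.

```perl
use strict;
use warnings;

# Sample records mirroring F2.txt: a name paired with the line that follows it.
my @records = (
    [ bob => "a=1   b=2   c=3" ],
    [ bob => "d=4   e=5   f=6" ],
    [ tom => "a1=34  b1=32  c1=3443" ],
);

my %lines_for;
for my $record (@records) {
    my ($name, $value) = @$record;
    # push creates the array reference automatically the first time a name is seen.
    push @{ $lines_for{$name} }, $value;
}

# Print each name followed by all of its lines.
for my $name (sort keys %lines_for) {
    print "$name\n";
    print "  $_\n" for @{ $lines_for{$name} };
}
```

With this layout there is no need to split the value on newlines later; the lines are already separate array elements.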

Here's a completely untested program:

use strict;
use warnings;
use autodie;
use feature qw(say);   #Better print than print

# Read in File #1
open my $file_1, "<", "F1.txt";
my %hash_1;
while (my $name = <$file_1>) {
   chomp $name;
   $hash_1{$name} = 1;
}
close $file_1;

# Read in File #2 -- a bit trickier

open my $file_2, "<", "F2.txt";
my %hash_2;
while (my $name = <$file_2>) {
    chomp $name;
    my $value = <$file_2>;              #The next line
    last if not defined $value;         #A name with no line after it
    chomp $value;
    next if not exists $hash_1{$name};  #Not interested if it's not in File #1
    if (exists $hash_2{$name}) {        #We've seen this before!
        $hash_2{$name} = $hash_2{$name} . "\n" . $value; #Appending value
    }
    else {
        $hash_2{$name} = $value;
    }
}
close $file_2;

Now, we have the data the way we want it. %hash_2 contains everything you need. I'm just not sure how you want to print it out. However, it would go something like this:

 my $counter = 1;    #Used for file numbering..
 foreach my $name (sort keys %hash_2) {
    open my $file, ">", "A" . $counter . ".txt";
    say $file $name;     #Name of person
    my @lines = split /\n/, $hash_2{$name};  #Lines in our hash value
    foreach my $line (@lines) {
      say $file $line;
    }
    close $file;
    $counter++;
 }

Notice that by using hashes, I avoid the nested for loops, which compare every line of one file against every line of the other and can end up eating a lot of time. I only go through three loops: The first two read in each file. The last one goes through the second file's hash, and prints it out.
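
The same idea answers the question in the title when the data is already in two arrays. The sketch below is only an illustration: common_elements is a hypothetical helper name, and the sample names come from the File #1 / File #2 table earlier. It builds a lookup hash from the first array, then filters the second array against it, so each array is walked once.

```perl
use strict;
use warnings;

# Return the elements of @$second that also appear in @$first.
# common_elements is a hypothetical helper name used for this sketch.
sub common_elements {
    my ($first, $second) = @_;
    my %seen = map { $_ => 1 } @$first;    # one pass over the first list
    return grep { $seen{$_} } @$second;    # one pass over the second list
}

my @file_1_names = qw(Bob Ted Alice Carol);
my @file_2_names = qw(Tom Dick Harry Ted);

my @common = common_elements(\@file_1_names, \@file_2_names);
print "In both: @common\n";    # In both: Ted
```

Passing array references keeps the two lists separate inside the sub. The results come back in the order they appear in the second array.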

Using hashes is a great way to track the data you've already read.
