Reputation: 1
This is frustrating. I have two text files that each contain one phone number per line. I need to read each line from file1 and search file2 for a match. If there is no match, write the line to an output file. I've been trying this but I know it's wrong.
$file1 = 'pokus1.txt';
$file2 = 'pokus2.txt';
open (F1, $file1) || die ("Could not open $file1!");
open (F2, $file2) || die ("Could not open $file2!");
open (OUTFILE, '>>output\output_x1.txt');
@f1data = <F1>;
@f2data = <F2>;
while (@f1data){
@grp = grep {/$f1data/} @f2data;
print OUTFILE "$grp";
}
close (F1);
close (F2);
close (OUTFILE);
I hope someone can help? Thanks Brent
Upvotes: 0
Views: 2513
Reputation: 107090
Whenever you get an "is a piece of data in one group also in another group" type of question (and they come up quite a bit), you should think in terms of hashes.
A hash is a keyed lookup. Let's say you create a hash keyed on say... I don't know... phone numbers taken from file #1. If you read a line in file #2, you can easily see if it's in file #1 by simply looking at the hash. Fast, efficient.
use strict; #ALWAYS ALWAYS ALWAYS
use warnings; #ALWAYS ALWAYS ALWAYS
use autodie; #Will end the program if files you try to open don't exist
# Constants are a great way of storing data that is ...uh... constant
use constant {
FILE_1 => "a1.txt",
FILE_2 => "a2.txt",
};
my %phone_hash;
open my $phone_num1_fh, "<", FILE_1;
#Let's build our phone number hash
while ( my $phone_num = <$phone_num1_fh> ) {
chomp $phone_num;
$phone_hash{ $phone_num } = 1; #Doesn't really matter, but best not a zero value
}
close $phone_num1_fh;
#Now that we have our phone hash, let's see if it's in file #2
open my $phone_num2_fh, "<", FILE_2;
while ( my $phone_num = <$phone_num2_fh> ) {
chomp $phone_num;
if ( exists $phone_hash{ $phone_num } ) {
print "$phone_num is in file #1 and file #2\n";
}
else {
print "$phone_num is only in file #2\n";
}
}
See how nicely that works. The only issue is that there may be phone numbers in file #1 that aren't in file #2. You could solve this by simply creating a second hash for all the phone numbers in file #2.
Let's do this one more time with two hashes:
my %phone_hash1;
my %phone_hash2;
open my $phone_num1_fh, "<", FILE_1;
while ( my $phone_num = <$phone_num1_fh> ) {
chomp $phone_num;
$phone_hash1{ $phone_num } = 1;
}
close $phone_num1_fh;
open my $phone_num2_fh, "<", FILE_2;
while ( my $phone_num = <$phone_num2_fh> ) {
chomp $phone_num;
$phone_hash2{ $phone_num } = 1;
}
close $phone_num2_fh;
Now, we'll use keys to list the keys and go through them. I'm going to create an %in_common hash that holds every phone number found in both hashes:
my %in_common;
for my $phone ( keys %phone_hash1 ) {
if ( $phone_hash2{$phone} ) {
$in_common{$phone} = 1; #Phone numbers in common between the two lists
}
}
Now I have three hashes: %phone_hash1, %phone_hash2, and %in_common.
for my $phone ( sort keys %phone_hash1 ) {
if ( not $in_common{$phone} ) {
print "Phone number $phone is only in the first file\n";
}
}
for my $phone ( sort keys %phone_hash2 ) {
if ( not $in_common{$phone} ) {
print "Phone number $phone is only in " . FILE_2 . "\n";
}
}
for my $phone ( sort keys %in_common ) {
print "Phone number $phone is in both files\n";
}
Note that in this example I didn't use exists to see whether the key is in the hash. That is, I simply wrote if ( $phone_hash2{$phone} ) instead of if ( exists $phone_hash2{$phone} ). The exists form checks whether the key is present in the hash, even if its value is a null string, numerically zero, or undefined.
The plain form is true only when the value is not zero, a null string, or undefined. Since I purposely set every value in the hash to 1, I can use this form. It's a good habit to use exists because there will be situations where a valid value could be a null string or zero. However, some people prefer the way the code reads without exists when possible.
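To make the difference between the two tests concrete, here is a minimal sketch. The %seen hash, the describe helper, and the sample numbers are made up for illustration and are not part of the code above; one key is deliberately stored with a false value.

```perl
use strict;
use warnings;

# Hypothetical sample data: one key holds a false value on purpose.
my %seen = (
    '555-0100' => 1,    # key present, true value
    '555-0101' => 0,    # key present, but the value is false
);

# Describe how each kind of test treats a given key.
sub describe {
    my ( $hash_ref, $key ) = @_;
    my $by_exists = exists $hash_ref->{$key} ? 'yes' : 'no';
    my $by_truth  = $hash_ref->{$key}        ? 'yes' : 'no';
    return "exists=$by_exists truth=$by_truth";
}

# The last number is not in the hash at all.
for my $phone ( '555-0100', '555-0101', '555-0199' ) {
    print "$phone: ", describe( \%seen, $phone ), "\n";
}
```

Only the key stored with a zero value gets different answers from the two tests, which is why setting every value to 1 makes the shorter form safe here.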
Upvotes: 1
Reputation: 1482
bash:
grep -Fxvf file1 file2 > file3   # lines of file2 that are not in file1
grep -Fxf file1 file2 > file4    # lines of file2 that are also in file1
Upvotes: 2
Reputation: 36282
A customary solution: process one file, saving its lines as keys of a hash, then process the other file and check whether each line exists as a key:
#!/usr/bin/env perl
use warnings;
use strict;
my (%phone);
open my $fh1, '<', shift or die;
open my $fh2, '<', shift or die;
##open my $ofh, '>>', shift or die;
while ( <$fh2> ) {
chomp;
$phone{ $_ } = 1;
}
while ( <$fh1> ) {
chomp;
next if exists $phone{ $_ };
##printf $ofh qq|%s\n|, $_;
printf qq|%s\n|, $_;
}
exit 0;
Run it like:
perl script.pl file1 file2 > outfile
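If you would rather have the script write the output file itself, as the question asks, something along these lines might work. This is only a sketch: the helper names lines_not_in and read_lines are invented for illustration, and it assumes both files fit comfortably in memory and that the output path is passed as a third argument.

```perl
use strict;
use warnings;

# Return the entries of @$candidates that do not appear in @$known,
# keeping their original order. Both arguments are array references
# holding already-chomped lines.
sub lines_not_in {
    my ( $candidates, $known ) = @_;
    my %seen = map { $_ => 1 } @$known;
    return grep { !exists $seen{$_} } @$candidates;
}

# Read a file into a list of chomped lines.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    chomp( my @lines = <$fh> );
    close $fh;
    return @lines;
}

# Only touch the filesystem when all three arguments are supplied.
if ( @ARGV == 3 ) {
    my ( $file1, $file2, $out_path ) = @ARGV;
    my @first  = read_lines($file1);
    my @second = read_lines($file2);
    open my $ofh, '>', $out_path or die "Cannot open $out_path: $!";
    print {$ofh} "$_\n" for lines_not_in( \@first, \@second );
    close $ofh or die "Cannot close $out_path: $!";
}
```

Run it like perl script.pl file1 file2 outfile. The output file then holds the numbers from file1 that have no exact match in file2.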
Upvotes: 1