Reputation: 67
I have two files.
For example, the content of file #1 is:
dynSamp/dgenExp
dynSamp/dgenLod
dynSamp/dgenStm
dynSamp/dgenUpd
dynSamp/dmlnodExp
dynSamp/dmlnodLod
dynSamp/dmlnodStm
dynSamp/dmlnodUpd
dynSamp/dmndynLod
dynSam/dmndynStm
dynSamp/dmndynUpd
sample/genExp
sample/genLod
sample/genStm
sample/genUpd
sample/mlnodExp
sample/mlnodLod
sample/mlnodStm
sample/mlnodUpd
sample/mndynLod
sample/mndynStm
sample/mndynUpd
sample/genLod
dynSamp/dgenLod
dynSamp/dmlnodLod
dynSamp/dmndynLod
sample/mndynLod
sample/mlnodLod
And the content of file #2 is:
dynSamp/dgenExp
dynSamp/dgenLod
dynSamp/dgenStm
dynSamp/dgenUpd
dynSamp/dmlnodStm
dynSamp/dmndynStm
dynSamp/dthrdsUpd_unix
dynSamp/dthrdsUpd_win
sample/genExp
sample/genLod
sample/genStm
sample/genUpd
sample/mlnodStm
sample/mndynStm
sample/thrdsUpd_unix
sample/thrdsUpd_win
sample/genLod
dynSamp/dgenLod
dynSamp/dmndynStm
dynSamp/dthrdsUpd_win
I would like to sort out these two files. The result should be the unique contents of the first file minus everything that appears in the second file (whether it is unique or duplicated there).
The following should be all that remains of file #1:
dynSamp/dmlnodExp
dynSamp/dmlnodLod
dynSamp/dmlnodUpd
dynSamp/dmndynLod
dynSamp/dmndynUpd
sample/mlnodExp
sample/mlnodLod
sample/mlnodUpd
sample/mndynLod
sample/mndynUpd
Can anyone please help me sort this out? Thanks!
Upvotes: 0
Views: 1274
Reputation: 385657
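You didn't ask any question, so I presume you are having problems coming up with an algorithm. Here's one: load every line of the second file into a hash, then go through the first file and print each line that is not yet in the hash, adding it to the hash as you go so that repeated lines are printed only once.
This algorithm preserves the order of the records of the first file.
Since it's rather trivial to code, I might as well provide that too.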
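use strict;
use warnings;

# Lines to skip: everything in file 2, plus lines of file 1 already printed.
my %skip;

{
    open(my $fh, '<', $ARGV[1])
        or die("Can't open \"$ARGV[1]\": $!\n");
    while (<$fh>) {
        chomp;
        ++$skip{$_};
    }
}

{
    open(my $fh, '<', $ARGV[0])
        or die("Can't open \"$ARGV[0]\": $!\n");
    while (<$fh>) {
        chomp;
        # Print a line only the first time it is seen; the post-increment
        # marks it so that later duplicates are skipped.
        print "$_\n" if !$skip{$_}++;
    }
}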
Usage:
script file1 file2 >file.out
Or sorted:
script file1 file2 | sort >file.out
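For readers who don't use Perl, here is a rough Python sketch of the same idea. It is only an illustration: the function name `unique_minus` and the tiny sample lists are made up for this example, and it takes lists of lines rather than file names.

```python
def unique_minus(first_lines, second_lines):
    """Return the lines of first_lines that are absent from second_lines,
    with duplicates dropped and the original order preserved."""
    # Start with everything from the second file marked as "skip".
    seen = set(second_lines)
    result = []
    for line in first_lines:
        if line not in seen:
            # First time this line is seen: keep it and mark it,
            # so any later duplicate is skipped.
            seen.add(line)
            result.append(line)
    return result


# Hypothetical sample data, shortened from the question.
first = ["sample/genLod", "sample/mlnodLod", "sample/mndynLod", "sample/mlnodLod"]
second = ["sample/genLod"]
remaining = unique_minus(first, second)
```

With this sample, `remaining` holds `sample/mlnodLod` and `sample/mndynLod`, in that order. A single set does both jobs (excluding file 2 and removing duplicates), which is what the `%skip` hash does in the Perl version.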
Upvotes: 3
Reputation: 67221
It's fairly straightforward in awk with sort:
awk 'FNR==NR{a[$0];next}{if(!($0 in a))print $0}' temp2 temp | sort -u
I also think dynSam/dmndynStm should be included in your output according to your requirement, since file #2 only contains dynSamp/dmndynStm:
> awk 'FNR==NR{a[$0];next}{if(!($0 in a))print $0}' temp2 temp | sort -u
dynSam/dmndynStm,
dynSamp/dmlnodExp,
dynSamp/dmlnodLod,
dynSamp/dmlnodUpd,
dynSamp/dmndynLod,
dynSamp/dmndynUpd,
sample/mlnodExp,
sample/mlnodLod,
sample/mlnodUpd,
sample/mndynLod,
sample/mndynUpd,
>
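In the one-liner, `FNR==NR` is true only while awk reads the first file named on the command line (`temp2`), so its lines become keys of the array `a`. Lines of the second file are printed only when they are not in `a`, and `sort -u` then sorts and deduplicates. In other words, it is a set difference followed by a sort. A minimal Python sketch of that reading follows; the function name `sorted_difference` is made up for illustration, and Python orders strings by code point, which may differ from what `sort` produces under your locale.

```python
def sorted_difference(first_lines, second_lines):
    """Return the distinct lines of first_lines that do not occur in
    second_lines, in sorted order (original order is not kept)."""
    # Set difference removes duplicates and the lines of the second file.
    return sorted(set(first_lines) - set(second_lines))


# Hypothetical sample data.
result = sorted_difference(["b", "a", "b", "c"], ["c"])
```

Here `result` is `["a", "b"]`. Unlike the order-preserving approach, this one always returns sorted output.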
Upvotes: 0
Reputation: 10470
I think you want something like this ...
dogface@computer ~
$ cat sortit.pl
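#!/usr/bin/perl -w
use strict;

my $file1 = 'file1';
my $file2 = 'file2';
my %bad;
my %good;

# Remember every line of file2 so it can be excluded.
open(BAD, '<', $file2) or die "Can't open $file2: $!\n";
while (<BAD>) {
    chomp;
    $bad{$_} = 1;
}
close BAD;

# Keep each line of file1 that is not in file2; the hash removes duplicates.
open(GOOD, '<', $file1) or die "Can't open $file1: $!\n";
while (<GOOD>) {
    chomp;
    next if $bad{$_};
    $good{$_} = 1;
}
close GOOD;

# Hash keys come back in no particular order.
open(OUT, '>', 'file3') or die "Can't open file3: $!\n";
foreach my $key ( keys %good ) {
    print OUT $key . "\n";
}
close OUT;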
dogface@computer ~
$ cat file1
dynSamp/dgenExp
dynSamp/dgenLod
dynSamp/dgenStm
dynSamp/dgenUpd
dynSamp/dmlnodExp
dynSamp/dmlnodLod
dynSamp/dmlnodStm
dynSamp/dmlnodUpd
dynSamp/dmndynLod
dynSam/dmndynStm
dynSamp/dmndynUpd
sample/genExp
sample/genLod
sample/genStm
sample/genUpd
sample/mlnodExp
sample/mlnodLod
sample/mlnodStm
sample/mlnodUpd
sample/mndynLod
sample/mndynStm
sample/mndynUpd
sample/genLod
dynSamp/dgenLod
dynSamp/dmlnodLod
dynSamp/dmndynLod
sample/mndynLod
sample/mlnodLod
dogface@computer ~
$ cat file2
dynSamp/dgenExp
dynSamp/dgenLod
dynSamp/dgenStm
dynSamp/dgenUpd
dynSamp/dmlnodStm
dynSamp/dmndynStm
dynSamp/dthrdsUpd_unix
dynSamp/dthrdsUpd_win
sample/genExp
sample/genLod
sample/genStm
sample/genUpd
sample/mlnodStm
sample/mndynStm
sample/thrdsUpd_unix
sample/thrdsUpd_win
sample/genLod
dynSamp/dgenLod
dynSamp/dmndynStm
dynSamp/dthrdsUpd_win
dogface@computer ~
$ ./sortit.pl
dogface@computer ~
$ cat file3
sample/mndynLod
dynSamp/dmlnodUpd
dynSamp/dmlnodLod
dynSamp/dmlnodExp
sample/mndynUpd
sample/mlnodUpd
sample/mlnodLod
dynSamp/dmndynLod
dynSamp/dmndynUpd
sample/mlnodExp
dynSam/dmndynStm
dogface@computer ~
$
Oh, and if you want file3 sorted, use the following loop instead:
foreach my $key ( sort keys %good ) {
    print OUT $key . "\n";
}
Upvotes: 0