Reputation: 25
I have 2 files.
For example, the content of file #1 is:
hi1
hi2
hi4
… of file #2 is:
hi1
hi4
hi3
hi5
I would like to compare these files so that a third file contains only the lines that appear in exactly one of them:
hi2
hi3
hi5
Can anyone point me in the right direction? I'm in dire need! Perl is preferred, but C/C++ is acceptable.
Upvotes: 0
Views: 2063
Reputation: 3744
Count each line, then print out the ones where the count is one:
#!/usr/bin/perl
use warnings;
use strict;

local @ARGV = ('file.1', 'file.2');  # the diamond operator reads both files in turn
my %lines;
while (<>) {
    $lines{$_}++;                    # count how many times each line occurs
}
# a count of 1 means the line appeared in only one of the files
print sort grep $lines{$_} == 1, keys %lines;
Upvotes: 0
Reputation: 9188
I know you asked for Perl or C, but on Unix (or on Windows with MKS or an equivalent Unix toolkit):
sort file1 file2 | uniq -u > file3
It doesn't get much simpler than that.
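To see it on the sample data from the question (a quick sketch; the file names `file1`, `file2`, `file3` are just placeholders):

```shell
# Recreate the two sample files from the question
printf 'hi1\nhi2\nhi4\n' > file1
printf 'hi1\nhi4\nhi3\nhi5\n' > file2

# sort merges and orders both files; uniq -u then keeps only the
# lines that occur exactly once in the combined, sorted input
sort file1 file2 | uniq -u > file3

cat file3
# hi2
# hi3
# hi5
```

Note that `uniq` only collapses *adjacent* duplicates, which is why the `sort` step is required first.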
Upvotes: 4
Reputation: 1055
Here's a quick bit of code to do what you want. There's no error checking, and I'm assuming your text files are not so huge that you'll run out of memory by loading all the text into a hash.
use strict;
use warnings;

open(my $fh1, '<', 'file1.txt');
open(my $fh2, '<', 'file2.txt');
my @file1 = <$fh1>;
my @file2 = <$fh2>;

my %text;
foreach my $line (@file1, @file2)
{
    chomp($line);
    $text{$line}++;    # count occurrences across both files
}

foreach my $line (sort keys %text)
{
    if ($text{$line} == 1)    # lines seen exactly once are unique to one file
    {
        print $line . "\n";
    }
}
Upvotes: 2
Reputation: 903
I'm still not sure you're describing the problem completely: hi3 is not duplicated, but hi4 is. So should the output contain hi3 instead of hi4? Hint: to detect duplicates in Perl, you probably want to use a hash.
Upvotes: -1