Reputation: 327
This is a common question, but my case is slightly different. I have 10 files and I want to extract the rows they have in common. I found this:
perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' file1 file2 file3 file4
or, on Linux:
comm [-1] [-2] [-3 ] file1 file2
But what if each file has 3 (or more) columns and I want to compare only the first 2 (or more) columns, ignoring the last one?
file1:
Col1  col2  col3
A     1     0
A     2     1
file2:
Col1  col2  col3
A     2     0.5
A     1     10
B     1     10
Desired output:
Col1  col2  file1  file2
A     1     0      10
A     2     1      0.5
So with 10 files, the output should have 10 value columns after the key columns, one per file. Can this be done as a Perl one-liner (by modifying the one above), or is there another way?
Upvotes: 0
Views: 1675
Reputation:
use strict;
use warnings;
use Array::Utils qw(intersect);

my $first_file = shift(@ARGV);
my @common_lines = ();

# Grab the first two columns of every line in the first file.
open(my $read, "<", $first_file) or die $!;
while (<$read>)
{
    chomp;
    my @arr = split /\t/;
    @arr = @arr[0,1];    # Only take the first two columns.
    push @common_lines, join("\t", @arr);
}
close($read);

foreach my $file (@ARGV)
{
    my @matched_lines = ();
    open($read, "<", $file) or die $!;
    while (<$read>)
    {
        chomp;
        my @arr = split /\t/;
        @arr = @arr[0,1];
        my $to_check = join("\t", @arr);
        # If $to_check is in @common_lines, put it in @matched_lines.
        if (grep { $_ eq $to_check } @common_lines)
        {
            push @matched_lines, $to_check;
        }
    }
    close($read);
    # Take out elements of @common_lines that aren't in @matched_lines.
    @common_lines = intersect(@common_lines, @matched_lines);
    unless (@common_lines)
    {
        print "No lines are common amongst the files!\n";
        last;    # Nothing left to match, so skip the remaining files.
    }
}

# Print the key columns shared by every file.
foreach (@common_lines)
{
    print "$_\n";
}
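The script above prints only the shared key columns, while the question also asks for each file's last column next to them. The following is a minimal sketch of one way to get that, not a definitive implementation. It assumes that every file starts with a header line, that columns are separated by whitespace, and that the value wanted from each file is its last column. The name `common_rows` is a hypothetical helper introduced for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns a header line followed by the rows whose first
# $nkeys columns occur in every file, with each file's last column appended.
sub common_rows {
    my ($nkeys, @files) = @_;
    my (%values, @order, @header);

    for my $i (0 .. $#files) {
        open(my $fh, '<', $files[$i]) or die "Cannot open $files[$i]: $!";

        # Assumption: the first line of each file is a header.
        my @names = split ' ', (scalar(<$fh>) // '');
        @header = @names[0 .. $nkeys - 1] if $i == 0;

        while (my $line = <$fh>) {
            chomp $line;
            my @cols = split ' ', $line;    # assumption: whitespace-separated
            next if @cols <= $nkeys;        # skip blank or short lines
            my $key = join "\t", @cols[0 .. $nkeys - 1];

            # Remember the order in which keys appear in the first file.
            push @order, $key if $i == 0 && !exists $values{$key};

            # Slot $i holds this file's value; a repeated key keeps the latest one.
            $values{$key}[$i] = $cols[-1];
        }
        close($fh);
    }

    my @rows = (join "\t", @header, @files);
    for my $key (@order) {
        my @vals = @{ $values{$key} };
        # Keep the key only if every file supplied a value for it.
        my $present = grep { defined $vals[$_] } 0 .. $#files;
        next unless $present == @files;
        push @rows, join "\t", $key, @vals;
    }
    return @rows;
}

# Compare on the first two columns when run with file names as arguments.
if (@ARGV) {
    print "$_\n" for common_rows(2, @ARGV);
}
```

Saved under a name of your choice (say `common_rows.pl`), it could be run as `perl common_rows.pl file1 file2 ... file10`. Keeping one array slot per file in a hash keyed on the compared columns avoids rescanning a list for every line, and changing the `2` in the last block compares on more key columns.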
Upvotes: 1