Reputation: 285
I have some data that looks like this:
G1 G2 G3 G4
Pf1 NO B1 NO D1
Pf2 NO NO C1 D1
Pf3 A1 B1 NO D1
Pf4 A1 NO C1 D2
Pf5 A3 B2 C2 D3
Pf6 NO B3 NO D3
For each column, I want to check whether an element (other than "NO") appears exactly twice, like A1 in column 2. Elements that appear three or more times should not be in the output. Each element that qualifies should be written next to the first-column element of its row. Of course, a single first-column element may end up with several such elements. The desired output looks like this:
Pf1 B1
Pf2 C1
Pf3 A1 B1
Pf4 A1 C1
Pf5 D3
Pf6 D3
I have some code that works in the opposite direction. It lists the first-column elements that correspond to the elements that appear exactly twice in the other columns. The code looks like this:
use Data::Dumper;
my %hash;
while (<DATA>) {
    next if $. == 1;    # skip the header line
    chomp;
    my ($first, @others) = split /\s+/;
    # append the row label to every value found in that row
    for (@others) {
        $hash{$_} .= ' ' . $first;
    }
}
print Dumper \%hash;
I need a push in the right direction to adapt it to my new purpose. Any help or suggestion is very welcome!
Upvotes: 0
Views: 49
Reputation: 50647
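my %hash;    # value => number of occurrences in the whole table
my @r;       # rows, with the "NO" placeholders removed
while (<DATA>) {
    next if $. == 1;    # skip the header line
    chomp;
    my @t = grep { $_ ne "NO" } split;
    push @r, \@t;
    # count every value except the row label in $t[0]
    $hash{$_}++ for @t[1 .. $#t];
}
for my $l (@r) {
    my $k = shift @$l;    # row label, e.g. Pf1
    # keep only the values seen exactly twice
    my @t = grep { $hash{$_} == 2 } @$l;
    print "$k @t\n";
}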
__DATA__
G1 G2 G3 G4
Pf1 NO B1 NO D1
Pf2 NO NO C1 D1
Pf3 A1 B1 NO D1
Pf4 A1 NO C1 D2
Pf5 A3 B2 C2 D3
Pf6 NO B3 NO D3
Output:
Pf1 B1
Pf2 C1
Pf3 A1 B1
Pf4 A1 C1
Pf5 D3
Pf6 D3
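Note that %hash above is keyed by the value alone, so a value that occurred in two different columns would be counted together. That makes no difference for the sample data, where every value belongs to a single column. If the counts really have to be kept per column, as the question's wording suggests, a minimal sketch could look like the following. The helper name twice_per_column and the in-memory @data list are assumptions made for illustration; they are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): takes the data rows
# and returns one output line per row, keeping only the values that occur
# exactly twice within their own column.
sub twice_per_column {
    my @lines = @_;
    my (@rows, @count);    # $count[$col]{$value} = occurrences in that column
    for my $line (@lines) {
        my ($key, @cells) = split ' ', $line;
        push @rows, [$key, @cells];
        for my $col (0 .. $#cells) {
            next if $cells[$col] eq 'NO';
            $count[$col]{ $cells[$col] }++;
        }
    }
    my @out;
    for my $row (@rows) {
        my ($key, @cells) = @$row;
        # indices of the columns whose value is seen exactly twice there
        my @keep = grep { $cells[$_] ne 'NO' && $count[$_]{ $cells[$_] } == 2 }
                   0 .. $#cells;
        push @out, join ' ', $key, @cells[@keep];
    }
    return @out;
}

# Sample rows from the question, header line already dropped.
my @data = (
    'Pf1 NO B1 NO D1',
    'Pf2 NO NO C1 D1',
    'Pf3 A1 B1 NO D1',
    'Pf4 A1 NO C1 D2',
    'Pf5 A3 B2 C2 D3',
    'Pf6 NO B3 NO D3',
);
print "$_\n" for twice_per_column(@data);
```

For the sample rows this prints the same six lines as the output above. The two versions only differ when the same value turns up in more than one column.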
Upvotes: 1