Reputation: 89
I want to find duplicate arrays in a hash whose values are arrays. I am building sets and storing them in a Perl hash. Afterwards, I need to extract: 1. the arrays that are complete duplicates (all values the same), and 2. the intersection of the arrays.
The source code is as follows:
use strict;
use warnings;
my @test1= ("Bob", "Flip", "David");
my @test2= ("Bob", "Kevin", "John", "Michel");
my @test3= ("Bob", "Flip", "David");
my @test4= ("Haidi", "Bob", "Grook", "Franky");
my @test5= ();
my @test6=();
my %arrayHash = (
    "ppl1" => [@test1],
    "ppl2" => [@test2],
    "ppl3" => [@test3],
    "ppl4" => [@test4],
    "ppl5" => [@test5],
    "ppl6" => [@test6],
);
Required Output: ppl1 and ppl3 have duplicate lists
Intersection of arrays= Bob
Note that empty arrays should not be reported as duplicates!
Upvotes: 2
Views: 261
Reputation: 3535
You need to compare the arrays stored under each pair of hash keys for equality. For that you can use the smartmatch operator (~~); note that it has been marked experimental since Perl 5.18, so newer perls may warn about it.
Next, you can use grep to filter out the keys that are not duplicates, and a hash to keep track of the keys that have already been reported.
#!/usr/bin/perl
use strict;
use warnings;
my @test1= ("Bob", "Flip", "David");
my @test2= ("Kevin", "John", "Michel");
my @test3= ("Bob", "Flip", "David");
my @test4= ("Haidi", "Grook", "Franky");
my @test5= ("Bob", "Flip", "David");
my @test6= ("Kevin", "John", "Michel");
my @test7= ("Haidi", "Grook", "Frank4");
my %arrayHash = (
    "ppl1" => [@test1],
    "ppl2" => [@test2],
    "ppl3" => [@test3],
    "ppl4" => [@test4],
    "ppl5" => [@test5],
    "ppl6" => [@test6],
    "ppl7" => [@test7],
);
my %seen;
foreach my $key1 (sort keys %arrayHash) {
    # skip empty arrays
    next unless @{ $arrayHash{$key1} };
    # collect the other keys whose array matches, unless $key1 was already reported
    my @keys = grep {
        ( @{ $arrayHash{$key1} } ~~ @{ $arrayHash{$_} } )
            && ( $_ ne $key1 )
            && ( not exists $seen{$key1} )
    } sort keys %arrayHash;
    if (@keys) {
        unshift( @keys, $key1 );
        print "@keys are duplicates \n";
        @seen{@keys} = @keys;
    }
}
output:
ppl1 ppl3 ppl5 are duplicates
ppl2 ppl6 are duplicates
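If the smartmatch operator is not available or its warnings are unwanted, the same order-sensitive comparison can be written by hand. The following is only a minimal sketch: arrays_equal is a hypothetical helper name, not part of the code above, and it assumes the elements are plain strings compared with ne.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the answer above):
# true when both array references hold the same strings in the same order.
sub arrays_equal {
    my ($left, $right) = @_;
    return 0 unless @$left == @$right;            # different lengths
    for my $i (0 .. $#$left) {
        return 0 if $left->[$i] ne $right->[$i];  # first mismatch
    }
    return 1;
}

print arrays_equal(["Bob", "Flip"], ["Bob", "Flip"]) ? "same\n" : "different\n";
print arrays_equal(["Bob", "Flip"], ["Flip", "Bob"]) ? "same\n" : "different\n";
```

Inside the grep above, arrays_equal($arrayHash{$key1}, $arrayHash{$_}) could then stand in for the ~~ expression, under the same string-only assumption.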
Upvotes: 0
Reputation: 53478
So there are two steps here:
Compare your arrays with each other. This is harder because they are multi-element arrays: you can't test them for equivalence directly, you need to compare their members.
Filter one from the other.
So, first of all:
(Edit: coping with empty arrays)
#!/usr/bin/env perl
use strict;
use warnings;
my @test1 = ( "Bob", "Flip", "David" );
my @test2 = ( "Kevin", "John", "Michel" );
my @test3 = ( "Bob", "Flip", "David" );
my @test4 = ( "Haidi", "Grook", "Franky" );
my @test5 = ();
my @test6 = ();
my %arrayHash = (
    "ppl1" => [@test1],
    "ppl2" => [@test2],
    "ppl3" => [@test3],
    "ppl4" => [@test4],
    "ppl5" => [@test5],
    "ppl6" => [@test6],
);
my %seen;
#cycle through the hash
foreach my $key ( sort keys %arrayHash ) {
    #skip empty:
    next unless @{ $arrayHash{$key} };
    #turn your array into a string - ':' separated
    my $value_str = join( ":", sort @{ $arrayHash{$key} } );
    #check if that 'value string' has already been seen
    if ( $seen{$value_str} ) {
        print "$key is a duplicate of $seen{$value_str}\n";
    }
    $seen{$value_str} = $key;
}
Now note - this is a bit of a cheat - it sticks your array elements together with ':', which doesn't work in every scenario. Because the elements are sorted and then joined, ("Bob", "David:Flip") and ("Bob:David", "Flip") will end up as the same string, Bob:David:Flip.
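One way to sidestep the separator problem is to prefix every element with its length before joining, so the element boundaries can no longer be confused. This is only a sketch under that assumption; list_signature is a hypothetical helper name, not something from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption): builds an order-insensitive key for a
# list of strings. Each element is written as "<length>:<element>", so a ':'
# inside an element cannot be mistaken for a boundary between elements.
sub list_signature {
    my @items = @_;
    return join '', map { length($_) . ':' . $_ } sort @items;
}

# These two lists collide when simply sorted and joined with ':',
# but get different signatures here.
print list_signature("Bob", "David:Flip"), "\n";
print list_signature("Bob:David", "Flip"), "\n";
```

The signature can be used as the %seen key in place of the plain join.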
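It will also only report each duplicate against the most recently seen key with the same values, rather than listing the whole group, if you have more than two matching arrays.
You can work around this - if you want - by pushing multiple keys into the %seen hash.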
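A minimal sketch of that workaround, assuming the same ':'-joined value string as above: each value string maps to an array of all keys that produced it, and only groups with more than one key are reported. duplicate_groups and %keys_for are hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption): takes a reference to a hash of
# name => array reference and returns one array reference per group of
# names whose (non-empty) lists contain the same values.
sub duplicate_groups {
    my ($lists) = @_;
    my %keys_for;    # value string => names that share it
    for my $name ( sort keys %$lists ) {
        next unless @{ $lists->{$name} };    # skip empty lists
        my $value_str = join( ":", sort @{ $lists->{$name} } );
        push @{ $keys_for{$value_str} }, $name;
    }
    # keep only the groups that contain more than one name
    return grep { @$_ > 1 } @keys_for{ sort keys %keys_for };
}

my %arrayHash = (
    "ppl1" => [ "Bob", "Flip", "David" ],
    "ppl2" => [ "Kevin", "John", "Michel" ],
    "ppl3" => [ "Bob", "Flip", "David" ],
    "ppl4" => [],
    "ppl5" => [ "David", "Bob", "Flip" ],
    "ppl6" => [],
);
print "@$_ are duplicates\n" for duplicate_groups( \%arrayHash );
```

With this sample data the only group is ppl1, ppl3 and ppl5; the two empty lists are skipped rather than reported as duplicates of each other.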
Upvotes: 1
Reputation: 1412
use strict;
use warnings;
my @test1= ("Bob", "Flip", "David");
my @test2= ("Kevin", "John", "Michel");
my @test3= ("Bob", "Flip", "David");
my @test4= ("Haidi", "Grook", "Franky");
my %arrayHash = (
    "1" => \@test1,
    "2" => \@test2,
    "3" => \@test3,
    "4" => \@test4,
);
# returns 1 if both arrays hold the same values (in any order), 0 otherwise
sub arrayCmp {
    my @array1 = @{$_[0]};
    my @array2 = @{$_[1]};
    return 0 if ($#array1 != $#array2);
    @array1 = sort(@array1);
    @array2 = sort(@array2);
    for (my $ii = 0; $ii <= $#array1; $ii++) {
        if ($array1[$ii] ne $array2[$ii]) {
            #print "$array1[$ii] != $array2[$ii]\n";
            return 0;
        }
    }
    return 1;
}
my @keyArr = sort(keys(%arrayHash));
for (my $i = 0; $i <= $#keyArr - 1; $i++) {
    my @arr1 = @{$arrayHash{$keyArr[$i]}};
    # start at $i + 1 so that each pair is compared (and reported) only once
    for (my $j = $i + 1; $j <= $#keyArr; $j++) {
        my @arr2 = @{$arrayHash{$keyArr[$j]}};
        if (arrayCmp(\@arr1, \@arr2) == 1) {
            print "$keyArr[$i] and $keyArr[$j] are duplicates\n";
        }
    }
}
This outputs:
1 and 3 are duplicates
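The question also asks for the intersection of the arrays, which the code above does not compute. A minimal sketch, assuming that empty arrays should be ignored (as the question requests for duplicates) and that a value counts once per array; common_values is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption): returns the sorted values that occur
# in every non-empty list passed in as an array reference.
sub common_values {
    my @lists = grep { @$_ } @_;    # ignore empty lists
    return () unless @lists;
    my %count;
    for my $list (@lists) {
        my %in_list = map { $_ => 1 } @$list;    # count each value once per list
        $count{$_}++ for keys %in_list;
    }
    # a value is common when it was counted in every list
    my @common = grep { $count{$_} == @lists } keys %count;
    @common = sort @common;
    return @common;
}

# Data taken from the question
my @common = common_values(
    [ "Bob", "Flip", "David" ],
    [ "Bob", "Kevin", "John", "Michel" ],
    [ "Bob", "Flip", "David" ],
    [ "Haidi", "Bob", "Grook", "Franky" ],
    [],
    [],
);
print "Intersection of arrays= @common\n";
```

With the question's data this prints Bob as the only common value.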
Upvotes: 0