user3817378

Reputation: 131

Remove duplicate hashes from a Perl array

I have the following Perl array:

my @arr = ({
  CONTEXTID => 1230,
  NAME => 'test8824',
  PROVIDERID => 163
}, {
  CONTEXTID => 8824,
  NAME => 'test8824',
  PROVIDERID => 77
}, {
  CONTEXTID => 8824,
  NAME => 'test8824',
  PROVIDERID => 779
}, {
  CONTEXTID => 8824,
  NAME => 'test8824',
  PROVIDERID => 141
}, {
  CONTEXTID => 1230,
  NAME => 'test8824',
  PROVIDERID => 163
});

I want to remove the duplicate hashes from the array. The output should look like this:

({
  CONTEXTID => 1230,
  NAME => 'test8824',
  PROVIDERID => 163
}, {
  CONTEXTID => 8824,
  NAME => 'test8824',
  PROVIDERID => 77
}, {
  CONTEXTID => 8824,
  NAME => 'test8824',
  PROVIDERID => 779
}, {
  CONTEXTID => 8824,
  NAME => 'test8824',
  PROVIDERID => 141
}
)

Two hashes count as duplicates only when all of their keys and values match; otherwise they are not duplicates.

Upvotes: 2

Views: 863

Answers (3)

user3817378

Reputation: 131

I found this solution in another Stack Overflow answer and it works for me, though I don't remember the original post.

 my %seen;
 my @array;    # put your array of hash references here
 @array = grep { my $e = $_; my $key = join '___', map { $e->{$_} } sort keys %$e; !$seen{$key}++ } @array;

Put your array of hash references in @array; the list returned by grep will contain only the unique hashes.
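One thing to watch for with this approach: the key is built from the values alone, so two hashes with different key names but the same values in sorted-key order would be treated as duplicates, and so could values that themselves contain the separator. Below is a minimal sketch of a variant that folds the key names into the string as well. It assumes flat hashes whose values are defined scalars; the helper name uniq_hashes and the "\x1F" separator are my own choices for illustration, not something from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first occurrence of each distinct flat hash.
sub uniq_hashes {
    my %seen;
    return grep {
        my $h = $_;
        # Include key names as well as values, in sorted-key order.
        # "\x1F" (ASCII unit separator) is assumed not to occur in the data.
        my $key = join "\x1F", map { ( $_, $h->{$_} ) } sort keys %$h;
        !$seen{$key}++;
    } @_;
}

my @array = (
    { CONTEXTID => 1230, NAME => 'test8824', PROVIDERID => 163 },
    { CONTEXTID => 8824, NAME => 'test8824', PROVIDERID => 77  },
    { CONTEXTID => 1230, NAME => 'test8824', PROVIDERID => 163 },
);

my @unique = uniq_hashes(@array);
print scalar(@unique), "\n";    # prints 2
```

Because grep keeps the first hash it sees for each key, the original order of the surviving elements is preserved.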

Upvotes: 2

Polar Bear

Reputation: 6798

Please verify that the following code satisfies your requirements:

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my @result;
my %seen;

my @arr = ({
  CONTEXTID => 1230,
  NAME => 'test8824',
  PROVIDERID => 163
}, {
  CONTEXTID => 8824,
  NAME => 'test8824',
  PROVIDERID => 77
}, {
  CONTEXTID => 8824,
  NAME => 'test8824',
  PROVIDERID => 779
}, {
  CONTEXTID => 8824,
  NAME => 'test8824',
  PROVIDERID => 141
}, {
  CONTEXTID => 1230,
  NAME => 'test8824',
  PROVIDERID => 163
});

foreach my $el ( @arr ) {
    my $k = join('|', @$el{qw/CONTEXTID NAME PROVIDERID/ });
    push @result, $el unless $seen{$k};
    $seen{$k} = 1;
}

print Dumper(\@result);

Output:

$VAR1 = [
          {
            'PROVIDERID' => 163,
            'CONTEXTID' => 1230,
            'NAME' => 'test8824'
          },
          {
            'NAME' => 'test8824',
            'CONTEXTID' => 8824,
            'PROVIDERID' => 77
          },
          {
            'CONTEXTID' => 8824,
            'PROVIDERID' => 779,
            'NAME' => 'test8824'
          },
          {
            'NAME' => 'test8824',
            'CONTEXTID' => 8824,
            'PROVIDERID' => 141
          }
        ];

Upvotes: 1

ikegami

Reputation: 385655

The following is a common idiom for removing duplicates:

my %seen;
my @unique = grep !$seen{$_}++, @strings;

That uses string comparison to determine whether two items are identical. That will not do in our case, as it would effectively compare the addresses of the hashes and find them all unique.
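To see why, note that a hash reference used as a hash key is stringified to something like HASH(0x...), which reflects the reference's address rather than its contents. A small sketch, with illustrative variable names and a cut-down version of the question's data:

```perl
use strict;
use warnings;

# Two separate hashes with identical contents.
my @items = (
    { CONTEXTID => 1230, PROVIDERID => 163 },
    { CONTEXTID => 1230, PROVIDERID => 163 },
);

# Stringifying a reference yields its type and address, not its contents.
print "$_\n" for @items;    # two HASH(0x...) strings with different addresses

my %seen;
my @unique = grep { !$seen{$_}++ } @items;
print scalar(@unique), "\n";    # prints 2: nothing was removed
```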

But we can easily generalize the above as follows:

my %seen;
my @unique = grep !$seen{key($_)}++, @items;

All we need now is a function key that produces a string such that the following conditions are true:

  • key($a) ne key($b) if $a is considered to be different than $b.
  • key($a) eq key($b) if $a is considered to be the same as $b.

In this case, we could use the following:

use feature qw( state );

use Cpanel::JSON::XS qw( );

sub key {
   state $encoder = Cpanel::JSON::XS->new->canonical;
   return $encoder->encode($_[0]);
}
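Putting the pieces together on the question's data might look like the sketch below. It swaps in the core module JSON::PP for Cpanel::JSON::XS purely so that it runs without installing anything; that substitution is my assumption, not part of the code above. The sub is named hash_key here (a hypothetical name) to keep it visually distinct from the built-in keys.

```perl
use strict;
use warnings;
use feature qw( state );

use JSON::PP qw( );    # core module, assumed here as a stand-in for Cpanel::JSON::XS

# canonical() sorts hash keys, so equal hashes always encode to the same string.
sub hash_key {
    state $encoder = JSON::PP->new->canonical;
    return $encoder->encode($_[0]);
}

my @arr = (
    { CONTEXTID => 1230, NAME => 'test8824', PROVIDERID => 163 },
    { CONTEXTID => 8824, NAME => 'test8824', PROVIDERID => 77  },
    { CONTEXTID => 8824, NAME => 'test8824', PROVIDERID => 779 },
    { CONTEXTID => 8824, NAME => 'test8824', PROVIDERID => 141 },
    { CONTEXTID => 1230, NAME => 'test8824', PROVIDERID => 163 },
);

my %seen;
my @unique = grep !$seen{ hash_key($_) }++, @arr;

print scalar(@unique), "\n";    # prints 4
```

One caveat with a JSON-based key: the encoder generally writes the number 163 and the string '163' differently, so this works best when a given field always holds the same kind of value. On the other hand, it handles nested hashes and arrays, which a simple join of values does not.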

Upvotes: 4
