Yash

Reputation: 3114

Perl code to remove duplicate entries from a file

I have a file (say bugs.txt) that is generated by running some code. The file contains a list of JIRA IDs. I want to write code that removes the duplicate entries from this file.

The logic should be generic, as the contents of bugs.txt will be different every time.

Sample input file bugs.txt:

BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221

Sample output:

BUG-111, BUG-122, BUG-123, JIRA-221, JIRA-234

My trial code:

my $file1="/path/to/file/bugs.txt";
my $Jira_nums;
open(FH, '<', $file1) or die $!;
  {
    local $/;
    $Jira_nums = <FH>;
  }
close FH;

I need help designing the logic to remove the duplicate entries from bugs.txt.

Upvotes: 0

Views: 474

Answers (2)

Toto

Reputation: 91385

You just need to add these lines to your script:

my %seen;
my @no_dups = grep { !$seen{$_}++ } split /,?\s/, $Jira_nums;

The complete script then looks like this:

use strict;
use warnings;
use Data::Dumper;

my $file1="/path/to/file/bugs.txt";
my $Jira_nums;
open(my $FH, '<', $file1) or die $!; # use a lexical filehandle
  {
    local $/; # slurp mode: read the whole file at once
    $Jira_nums = <$FH>;
  }
close $FH;
my %seen;
# keep only the first occurrence of each ID
my @no_dups = grep { !$seen{$_}++ } split /,?\s/, $Jira_nums;
print Dumper \@no_dups;

For input data like:

BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221

it gives:

$VAR1 = [
          'BUG-111',
          'BUG-122',
          'BUG-123',
          'JIRA-221',
          'JIRA-234'
        ];
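
To get the comma-separated form shown in the question instead of a Dumper listing, the unique IDs can be joined back together. This is only a minimal sketch: it assumes the IDs are separated by an optional comma followed by whitespace, and the helper name dedupe_ids is hypothetical, not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the unique IDs found in a string,
# in the order they first appear.
sub dedupe_ids {
    my ($text) = @_;
    my %seen;
    # Split on an optional comma followed by whitespace, drop empty
    # fields, and keep only the first occurrence of each ID.
    return grep { length($_) && !$seen{$_}++ } split /,?\s+/, $text;
}

my $input = "BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221\n";
print join(', ', dedupe_ids($input)), "\n";
# prints: BUG-111, BUG-122, BUG-123, JIRA-221, JIRA-234
```

The same joined string could be printed to a filehandle opened for writing if the result should replace the contents of bugs.txt.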

Upvotes: 1

ssr1012

Reputation: 2589

You can try this:

use strict;
use warnings;

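my @bugs;
# Collect the IDs from every line; split on an optional comma plus whitespace
push @bugs, split /,?\s+/, $_ while <DATA>;
my @Sequenced = RemoveDup(@bugs);

print "@Sequenced\n";

# Return the arguments with duplicates removed, keeping first-seen order
sub RemoveDup { my %checked; return grep { !$checked{$_}++ } @_; }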


__DATA__
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221

Upvotes: 0
