Reputation: 3114
I have a file (say bugs.txt) that is generated by running some code. The file contains a list of JIRA IDs. I want to write code that removes duplicate entries from this file.
The logic should be generic, as the contents of bugs.txt will be different every time.
Sample input file bugs.txt:
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
Sample output:
BUG-111, BUG-122, BUG-123, JIRA-221, JIRA-234
My trial code:
my $file1="/path/to/file/bugs.txt";
my $Jira_nums;
open(FH, '<', $file1) or die $!;
{
local $/;
$Jira_nums = <FH>;
}
close FH;
I need help designing the logic for removing duplicate entries from bugs.txt.
Upvotes: 0
Views: 474
Reputation: 91385
You just need to add these lines to your script:
my %seen;
my @no_dups = grep{!$seen{$_}++}split/,?\s/,$Jira_nums;
The complete script then looks like this:
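use strict;
use warnings;
use feature 'say'; # say is not available without this
use Data::Dumper;
my $file1="/path/to/file/bugs.txt";
my $Jira_nums;
open(my $FH, '<', $file1) or die $!; # use a lexical filehandle
{
local $/; # slurp mode: read the whole file in one go
$Jira_nums = <$FH>;
}
close $FH;
my %seen;
# split on an optional comma followed by whitespace, keep only first occurrences
my @no_dups = grep{!$seen{$_}++}split/,?\s/,$Jira_nums;
say Dumper \@no_dups;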
For input data like:
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
it gives:
$VAR1 = [
'BUG-111',
'BUG-122',
'BUG-123',
'JIRA-221',
'JIRA-234'
];
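If the goal is the comma-separated line shown in the question rather than a Dumper listing, the unique IDs can be joined back together. The following is only a minimal sketch: the helper name dedupe_ids, the in-memory sample string, and the output path in the comments are illustrative assumptions, not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the unique IDs in first-seen order.
# Splits on any run of commas and whitespace and skips empty fields.
sub dedupe_ids {
    my ($text) = @_;
    my %seen;
    return grep { length($_) && !$seen{$_}++ } split /[,\s]+/, $text;
}

# Demo on the sample data from the question (no file needed).
my $Jira_nums = "BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221\n";
my @no_dups = dedupe_ids($Jira_nums);
print join(', ', @no_dups), "\n";

# To write the result to a file instead (the path is a placeholder):
# open(my $OUT, '>', '/path/to/file/bugs_unique.txt') or die $!;
# print {$OUT} join(', ', @no_dups), "\n";
# close $OUT;
```

Writing to a separate file first, and replacing the original only once that succeeds, avoids losing data if the script dies halfway through.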
Upvotes: 1
Reputation: 2589
You can try this:
use strict;
use warnings;
my @bugs;
# Collect the IDs from every line; split on an optional comma plus whitespace
while (my $line = <DATA>) {
    chomp $line;
    push @bugs, split /,?\s+/, $line;
}
my @Sequenced = RemoveDup(@bugs);
print join(', ', @Sequenced), "\n";
# Keep only the first occurrence of each element
sub RemoveDup { my %checked; grep { !$checked{$_}++ } @_; }
__DATA__
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
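As an alternative to a hand-written RemoveDup, reasonably recent Perls can use uniq from the core List::Util module. This is a different technique from the one above, shown only as a sketch; it assumes List::Util 1.45 or later is installed, and the variable names are illustrative.

```perl
use strict;
use warnings;
use List::Util qw(uniq);   # uniq requires List::Util 1.45 or later

# Sample line from the question, used here instead of reading a file.
my $line = "BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221";

# uniq keeps the first occurrence of each value and preserves order.
my @unique = uniq split /,\s*/, $line;
print join(', ', @unique), "\n";
```

Like the grep approach, uniq compares the values as strings, so IDs that differ only in case or surrounding whitespace are treated as distinct.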
Upvotes: 0