Reputation: 209
I have a text string that contains some duplicate characters (FFGGHHJKL). They can be made unique with a positive lookahead:
$ perl -pe 's/(.)(?=.*?\1)//g'
For example, with "FFEEDDCCGG", the output is "FEDCG".
My question is how to make this work on numbers. For example, for 212 212 43 43 5689 6689 5689 71 81 the output should be 212 43 5689 6689 71 81. Also, what if we want only the duplicate records as output, from a file with n rows?
212 212 43 43 5689 6689 5689 71 81 66 66 67 68 69 69 69 71 71 52 ..
Output:
212 212 43 43 5689 5689 66 66 69 69 69 71 71
How can I do this?
Upvotes: 2
Views: 245
Reputation: 5072
The following is untested, but should print out only the duplicates.
my $line = "212 212 43 43 5689 6689 5689 71 81\n";
chomp $line;
my %seen;
my @order;
foreach my $elem (split /\s+/, $line) {
++$seen{$elem};
push @order, $elem if $seen{$elem} == 2;
}
foreach my $elem (@order) {
print "$elem " x $seen{$elem};
}
print "\n";
To remove duplicates, you can now print the keys of %seen:
print "$_ " for keys %seen;
But that doesn't retain the order. You can do something similar to what I did for printing only the dupes, or use a module like Tie::Hash::Indexed (thanks, daxim) or Tie::IxHash.
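A minimal sketch of that order-preserving variant, assuming whitespace-separated input. The helper name uniq_in_order is hypothetical and only used for illustration. It keeps an element the first time it is seen and skips every later occurrence, so no tied hash is needed:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer above): return the first
# occurrence of each element, in the order the elements were first seen.
sub uniq_in_order {
    my %seen;
    # $seen{$_}++ is 0 (false) on first sight, so !$seen{$_}++ keeps it;
    # on later sightings the count is already positive and the element is dropped.
    return grep { !$seen{$_}++ } @_;
}

my $line = "212 212 43 43 5689 6689 5689 71 81";
print join(" ", uniq_in_order(split ' ', $line)), "\n";
# prints: 212 43 5689 6689 71 81
```

Unlike printing keys %seen, this never relies on hash ordering, because the output order comes from the input list itself.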
Upvotes: 2
Reputation: 139441
For the first part:
$ cat prog.pl
#! /usr/bin/perl -lp
my %seen;
$_ = join " " => map $seen{$_}++ ? () : $_ => split;
$ echo 212 212 43 43 5689 6689 5689 71 81 | ./prog.pl
212 43 5689 6689 71 81
For the second part:
$ cat prog.pl
#! /usr/bin/perl -lp
my %dups;
my @nums = split;
++$dups{$_} for @nums;
$_ = join " " => grep $dups{$_} > 1 => @nums;
$ cat input
212 212 43 43 5689 6689 5689 71 81
66 66 67 68 69 69 69 71 71 52
$ ./prog.pl input
212 212 43 43 5689 5689
66 66 69 69 69 71 71
Upvotes: 0