Reputation: 11
I'm a Unix shell script newbie. I know several different ways to find duplicates, but I can't find a simple way to remove duplicates while maintaining the original order (sort -u loses the original order).
Example: a script called dedupe.sh
Sample run:
dedupe.sh cat dog cat bird fish bear dog
results in:
cat dog bird fish bear
Upvotes: 1
Views: 2608
Reputation: 5786
Ahh perl... the write-only language. :)
As long as you're calling out to another scripting language, might as well consider something readable. :)
#!/usr/bin/env ruby
puts ARGV.uniq.join(' ')
which means:
puts = "print whatever comes next"
ARGV = "input argument array"
uniq = "array method to perform the behavior you're looking for and remove duplicates"
join(' ') = "join the elements with spaces; without it, puts prints one element per line. Not necessarily needed if you're piping to something else"
Upvotes: 0
Reputation: 185025
Using awk:
$ printf '%s\n' cat dog cat bird fish bear dog | awk '!arr[$1]++'
cat
dog
bird
fish
bear
or
$ echo 'cat dog cat bird fish bear dog' | awk '!arr[$1]++' RS=" "
or, if the original order does not matter (sort -u sorts the output):
$ printf '%s\n' cat dog cat bird fish bear dog | sort -u
If it works in a shell, it will work in a script =)
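For what it's worth, here is a minimal sketch of the awk filter wrapped in the dedupe.sh script from the question. The function name dedupe and the paste step that puts the words back on one line are my assumptions, not part of the answer above, and arguments that contain newlines are not handled.

```shell
#!/bin/sh
# dedupe.sh: print each argument once, in first-seen order, on one line.
# Sketch only; the function name "dedupe" is hypothetical.
dedupe() {
    # One argument per line -> awk keeps only the first occurrence of
    # each line -> paste joins the surviving lines with single spaces.
    printf '%s\n' "$@" | awk '!seen[$0]++' | paste -sd ' ' -
}

dedupe "$@"
```

Running ./dedupe.sh cat dog cat bird fish bear dog should then print the words on a single line. The filter keys on $0 rather than $1 so that an argument containing spaces is treated as one word.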
Upvotes: 2
Reputation: 4806
Did you say Perl?
perl -e 'while(defined($_=shift@ARGV)){$seen{$_}++||print"$_ "}print"\n"' \
cat dog cat bird fish bear dog
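Equivalently, dedupe.pl contains:
#!/usr/bin/perl
# Print each argument the first time it appears, separated by spaces.
while (defined($w = shift @ARGV)) {
    # %seen counts how often each word has come up; print only on the first hit.
    $seen{$w}++ || print "$w ";
}
print "\n";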
Now chmod u+x dedupe.pl
and run:
./dedupe.pl cat dog cat bird fish bear dog
Either way, the output is as desired:
cat dog bird fish bear
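The same "seen" hash idea can probably be done without leaving the shell, as long as the shell is bash 4.0 or newer (associative arrays are not available in plain sh or in bash 3.x). This is only a sketch under that assumption; the function name dedupe_bash and the "_" key prefix are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 4.0+ for associative arrays (local -A).
# The function name "dedupe_bash" is hypothetical.
dedupe_bash() {
    local -A seen=()   # words already kept, playing the role of %seen in Perl
    local -a out=()    # unique words, in first-seen order
    local w
    for w in "$@"; do
        # The "_" prefix avoids an empty subscript, which bash rejects.
        if [[ -z "${seen[_$w]+set}" ]]; then
            seen["_$w"]=1
            out+=("$w")
        fi
    done
    # "${out[*]}" joins the elements with a space (first character of IFS).
    printf '%s\n' "${out[*]}"
}

dedupe_bash "$@"
```

Saved as a script and called with cat dog cat bird fish bear dog, this should print the five unique words in their original order.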
Upvotes: 1