user2180616

Reputation: 11

Find & remove duplicate strings via Unix shell script. How to?

I'm a Unix shell script newbie. I know several different ways to find duplicates, but I can't find a simple way to remove them while maintaining the original order (sort -u loses the original order).

Example: a script called dedupe.sh

Sample run:

dedupe.sh cat dog cat bird fish bear dog

results in: cat dog bird fish bear

Upvotes: 1

Views: 2608

Answers (3)

Eric Haynes

Reputation: 5786

Ahh perl... the write-only language. :)

As long as you're calling out to another scripting language, might as well consider something readable. :)

#!/usr/bin/env ruby

puts ARGV.uniq.join(' ')

which means:

puts = "print whatever comes next"
ARGV = "input argument array"
uniq = "array method to perform the behavior you're looking for and remove duplicates"
join(' ') = "join with spaces instead of the default newline; not strictly needed if you're piping to something else"
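For comparison, the same first-seen-order behavior can be sketched in plain POSIX shell without calling out to another language. This is only a minimal sketch: the function name dedupe_sh is made up for illustration, and it assumes the words contain no whitespace.

```shell
#!/bin/sh
# dedupe_sh (hypothetical name): print each argument once, in first-seen order.
# Assumption: arguments contain no spaces, tabs, or newlines.
# The body runs in a subshell ( ... ) so that seen/out do not leak to the caller.
dedupe_sh() (
    seen=' '   # space-delimited list of words already emitted
    out=''     # result accumulated so far
    for w in "$@"; do
        case "$seen" in
            *" $w "*) ;;                      # already seen: skip
            *) seen="$seen$w "                # remember the word
               out="${out:+$out }$w" ;;       # append, space-separated
        esac
    done
    printf '%s\n' "$out"
)
```

Calling dedupe_sh cat dog cat bird fish bear dog prints cat dog bird fish bear. The membership check is a linear scan of a string, so this is fine for short argument lists but slower than a hash-based approach for large inputs.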

Upvotes: 0

Gilles Quénot

Reputation: 185025

Using awk:

$ printf '%s\n' cat dog cat bird fish bear dog | awk '!arr[$1]++'
cat
dog
bird
fish
bear

or

$ echo 'cat dog cat bird fish bear dog' | awk '!arr[$1]++' RS=" "

or, if the original order does not need to be preserved (sort -u prints the words in sorted order):

$ printf '%s\n' cat dog cat bird fish bear dog | sort -u

If it works in a shell, it will work in a script =)
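As a rough sketch of how the awk one-liner above could be wrapped into the dedupe.sh script from the question: the function name dedupe and the use of paste to rejoin the words on one line are assumptions added here, not part of the original answer.

```shell
#!/bin/sh
# dedupe.sh (sketch): print each argument once, in first-seen order,
# joined by single spaces on one line.
# Assumption: arguments contain no newlines, since one word is passed per line.
dedupe() {
    # printf emits one argument per line;
    # awk prints a line only the first time it is seen;
    # paste -s joins the surviving lines with a space.
    printf '%s\n' "$@" | awk '!seen[$0]++' | paste -sd ' ' -
}

dedupe "$@"
```

After chmod u+x dedupe.sh, running ./dedupe.sh cat dog cat bird fish bear dog should print cat dog bird fish bear. The sketch keys on $0 rather than $1 so that an argument containing spaces is treated as a single item.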

Upvotes: 2

minopret

Reputation: 4806

Did you say Perl?

perl -e 'while($_=shift@ARGV){$seen{$_}++||print"$_ "}print"\n"' \
cat dog cat bird fish bear dog

Equivalently, dedupe.pl contains:

#!/usr/bin/perl
# Print each argument the first time it is seen, followed by a space.
while (defined($w = shift @ARGV)) {
    $seen{$w}++ || print "$w ";
}
print "\n";

Now chmod u+x dedupe.pl and:

./dedupe.pl cat dog cat bird fish bear dog

Either way, output is as desired.

cat dog bird fish bear 

Upvotes: 1
