Reputation: 74
What's a smart and easy way to remove duplicates (not necessarily consecutive) from the delimited items on each line?
BEFORE:
apple,banana,apple,cherry,cherry
delta,epsilon,delta,epsilon
apple pie,delta,delta
AFTER:
apple,banana,cherry
delta,epsilon
apple pie,delta
It should work on a Mac and handle Unicode. Any shell method, language, or command is fine. The duplicates are not necessarily consecutive.
Note: this question is a variation of "How to remove dupes from blocks of text", which deals with blocks of text separated by blank lines.
Upvotes: 1
Views: 65
Reputation: 1456
awk -F, '{ for(i=1;i<=NF;i++) if( split($0,t,$i)>2 ) sub($i",","") }1' file
banana,apple,cherry
delta,epsilon
apple pie,delta
sed version:
sed -E 's/(.+)(.*),\1/\1\2,/g;s/,$//' file
apple,banana,cherry
delta,epsilon
apple pie,delta
Just Code.
Upvotes: 1
Reputation: 203985
$ awk 'BEGIN { FS=OFS="," }
{
    delete seen
    sep=""
    for (i=1;i<=NF;i++) {
        if (!seen[$i]++) {
            printf "%s%s", sep, $i
            sep = OFS
        }
    }
    print ""
}' file
apple,banana,cherry
delta,epsilon
apple pie,delta
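If the delimiter varies, the same seen-array approach can be wrapped in a small shell function. This is only a sketch: the function name dedupe_fields and the optional delimiter argument are illustrative assumptions, not part of the answer above, and the delimiter is assumed to be a single literal character. Items are compared as whole strings, so multibyte UTF-8 items should dedupe as long as the copies are byte-identical.

```shell
# dedupe_fields: hypothetical helper name (an assumption for illustration).
# Reads lines on stdin and keeps the first occurrence of each delimited item.
# Optional first argument: a single-character delimiter (defaults to a comma).
dedupe_fields() {
    awk -v d="${1:-,}" 'BEGIN { FS = OFS = d }
    {
        split("", seen)              # clear the array at the start of each line
        sep = ""
        for (i = 1; i <= NF; i++) {
            if (!seen[$i]++) {       # true only the first time this item is seen
                printf "%s%s", sep, $i
                sep = OFS
            }
        }
        print ""                     # terminate the rebuilt line
    }'
}

# Usage: read from a pipe (or redirect a file with: dedupe_fields < file)
printf '%s\n' 'apple,banana,apple,cherry,cherry' 'apple pie,delta,delta' | dedupe_fields
```

Here split("", seen) stands in for delete seen; both clear the array, and the split form also works in older awks that lack whole-array delete.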
Upvotes: 1