Reputation: 37
I'm an awk novice and was wondering whether this is doable.
My file:
CCDDBBAA
EFGHAC
KJLDFU
ABBAAC
Desired output:
ABCD
ACEFGH
DFJKLU
ABC
I want to sort the characters within each line alphabetically and remove the duplicate characters from that line.
Thanks!
Upvotes: 2
Views: 891
Reputation: 2662
With gawk:
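awk -v FS="" '{
  # with an empty FS, gawk makes every character its own field
  for (i = 1; i <= NF; i++)
    a[$i]              # referencing the element creates it, so duplicates collapse
  d = asorti(a, b)     # b[1..d] holds the indices of a in sorted order
  for (x = 1; x <= d; x++)
    printf "%s", b[x]
  print ""
  delete a
  delete b
}' file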
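If gawk isn't available, here is a minimal sketch of the same idea that avoids asorti and the empty-FS splitting, both of which are extensions. It walks each line with substr and keeps the distinct characters in order with a small insertion sort. The function name sort_uniq_chars and the array names are just for illustration, and string comparison order may depend on your awk and locale.

```shell
# Hypothetical helper: reads lines on stdin, prints each line's
# distinct characters in sorted order.
sort_uniq_chars() {
  awk '{
    n = 0
    split("", seen)                      # clear the array portably
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)
      if (!(c in seen)) {                # first time this character appears
        seen[c] = 1
        # insertion sort: shift larger characters right, then place c
        for (j = n; j >= 1 && chars[j] > c; j--)
          chars[j + 1] = chars[j]
        chars[j + 1] = c
        n++
      }
    }
    out = ""
    for (i = 1; i <= n; i++)
      out = out chars[i]
    print out
  }'
}

printf '%s\n' CCDDBBAA EFGHAC KJLDFU ABBAAC | sort_uniq_chars
```

With the sample input from the question this prints ABCD, ACEFGH, DFJKLU and ABC, one per line.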
Upvotes: 1
Reputation: 203493
With GNU awk 4.* for sorted_in and splitting a record into characters when FS is null:
$ cat tst.awk
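# null FS: one character per field; sorted_in: visit indices in ascending string order
BEGIN { FS=OFS=ORS=""; PROCINFO["sorted_in"]="@ind_str_asc" }
{
    for (i=1;i<=NF;i++) a[$i]   # each distinct character becomes an index of a
    for (i in a) print i        # ORS is null, so the characters print back to back
    print RS                    # finish the output line with a newline
    delete a
}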
$ awk -f tst.awk file
ABCD
ACEFGH
DFJKLU
ABC
Upvotes: 0
Reputation: 246807
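perl (-l strips the newline before the split and puts it back on output; without it the newline sorts ahead of the letters):
perl -lpe '%x = map {$_=>1} split ""; $_ = join "", sort keys %x' file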
or ruby (chomp first so the newline is not sorted along with the letters):
ruby -pe '$_ = $_.chomp.chars.uniq.sort.join + "\n"' file
Upvotes: 0
Reputation: 58401
This might work for you (GNU sed & sort):
sed 's/\s*/\n/g;s/.*/echo "&"|sort -u/e;s/\n//g' file
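Replace any white space, and the gaps between characters, with newlines so that each character sits on its own line. Run the result through sort -u (the e flag executes the pattern space as a shell command) to sort the lines and remove duplicates. Finally, remove the introduced newlines.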
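The same three steps can be sketched as an ordinary pipeline, which may be easier to follow than the sed one-liner. This is only an illustration, not the command above: the function name is made up, it assumes single-byte (ASCII) characters, and it uses fold, sort and tr instead of GNU sed's e flag. It also starts several processes per input line, so it will be slow on large files.

```shell
# Hypothetical helper: reads lines on stdin, prints each line's
# distinct characters in sorted order.
sort_uniq_chars_pipeline() {
  while IFS= read -r line || [ -n "$line" ]; do
    printf '%s\n' "$line" |
      tr -d '[:space:]' |     # step 1a: remove white space
      fold -w 1 |             # step 1b: one character per line
      LC_ALL=C sort -u |      # step 2: sort and drop duplicates
      tr -d '\n'              # step 3: remove the introduced newlines
    echo                      # end the output line
  done
}

printf '%s\n' CCDDBBAA EFGHAC KJLDFU ABBAAC | sort_uniq_chars_pipeline
```

LC_ALL=C is set on sort so the order is plain byte order regardless of the current locale.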
Upvotes: 1