BlueHam

Reputation: 37

awk - Sort a string alphabetically and remove duplicates within the string

awk novice here, was wondering if this is doable.

My file:

CCDDBBAA 
EFGHAC 
KJLDFU
ABBAAC

Desired output:

ABCD
ACEFGH
DFJKLU
ABC

I want to sort the strings in my file alphabetically and remove the duplicates within the string.

Thanks!

Upvotes: 2

Views: 891

Answers (4)

jijinp

Reputation: 2662

With gawk:

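awk -v FS="" '{
    # With an empty FS, gawk makes every character its own field;
    # using each one as an array index collapses duplicates.
    for (i = 1; i <= NF; i++)
        a[$i]
    # asorti() stores the sorted indices of a[] in b[1..d].
    d = asorti(a, b)
    for (x = 1; x <= d; x++)
        printf "%s", b[x]
    print ""
    delete a
    delete b
}' file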
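
The script above depends on gawk: `asorti()` is a gawk extension, and splitting on an empty FS is not specified by POSIX. As a rough sketch for other awks, assuming the input is printable ASCII only, you can walk the character codes in order and keep those that occur in the line. The function name `sort_uniq_chars_awk` is made up for illustration.

```shell
# Hypothetical helper name; the awk program uses only POSIX features.
sort_uniq_chars_awk() {
    awk '{
        out = ""
        # Walk printable ASCII in order (assumption: input is ASCII).
        # Whitespace is skipped because the range starts at code 33.
        for (c = 33; c <= 126; c++) {
            ch = sprintf("%c", c)
            if (index($0, ch))
                out = out ch
        }
        print out
    }' "$@"
}

printf '%s\n' 'CCDDBBAA ' 'EFGHAC ' 'KJLDFU' 'ABBAAC' | sort_uniq_chars_awk
```

Because the loop visits the codes in ascending order and appends each character at most once, sorting and de-duplication both fall out of it without any array. The function also accepts file names, e.g. `sort_uniq_chars_awk file`.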

Upvotes: 1

Ed Morton

Reputation: 203493

With GNU awk 4.* for sorted_in and splitting a record into characters when FS is null:

$ cat tst.awk
BEGIN { FS=OFS=ORS=""; PROCINFO["sorted_in"]="@ind_str_asc" }
{
    for (i=1;i<=NF;i++) a[$i]
    for (i in a) print i
    print RS
    delete a
}

$ awk -f tst.awk file
ABCD
ACEFGH
DFJKLU
ABC

Upvotes: 0

glenn jackman

Reputation: 246807

perl:

perl -lpe '%x = map {$_=>1} split ""; $_ = join "", sort keys %x' file

or ruby:

ruby -pe '$_ = $_.chomp.chars.uniq.sort.join + "\n"' file

Upvotes: 0

potong

Reputation: 58401

This might work for you (GNU sed & sort):

sed 's/\s*/\n/g;s/.*/echo "&"|sort -u/e;s/\n//g' file

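The first substitution replaces any whitespace and puts each character on its own line. The second wraps the pattern space in an echo piped to sort -u; the e flag runs that as a shell command, so the lines come back sorted with duplicates removed. The last substitution removes the introduced newlines.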
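
The same three steps can be spelled out as an ordinary pipeline, which may be easier to follow than the one-liner. This is only a sketch using `tr`, `fold` and `sort` rather than sed, and the function name `sort_uniq_chars` is made up for illustration.

```shell
# Hypothetical helper (not part of the sed answer): apply the
# split / sort -u / join steps to every line read from stdin.
sort_uniq_chars() {
    while IFS= read -r line || [ -n "$line" ]; do
        # 1. drop whitespace, 2. one character per line,
        # 3. sort and de-duplicate, 4. join the characters again.
        printf '%s' "$line" | tr -d '[:space:]' | fold -w1 | LC_ALL=C sort -u | tr -d '\n'
        printf '\n'
    done
}

printf '%s\n' 'CCDDBBAA ' 'EFGHAC ' 'KJLDFU' 'ABBAAC' | sort_uniq_chars
```

`LC_ALL=C` pins the sort to byte order so the result does not depend on the locale. For large files this starts several processes per line, just as the sed version starts a shell per line, so the awk answers will be faster.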

Upvotes: 1
