Bubbles

Reputation: 97

Get first N chars and sort them

I need to take the first four characters from each line of a file and sort the characters within each line.

I tried the command below, but it doesn't sort the characters of each line:

cut -c1-4 simple_file.txt | sort -n

Output of the above:

appl
bana
uoia

Expected output:

alpp
aabn
aiou

Upvotes: 1

Views: 871

Answers (3)

Tom Fenech

Reputation: 74595

sort isn't the right tool for the job in this case, as it is used to sort lines of input, not the characters within each line.

I know you didn't tag the question with Perl, but here's one way you could do it:

perl -F'' -lane 'print(join "", sort @F[0..3])' file

This uses the -a switch to auto-split each line of input on the delimiter specified by -F (in this case, an empty string, so each character is its own element in the array @F). It then sorts the first 4 characters of the array using the standard string comparison order. The result is joined together on an empty string.
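For readers without Perl, the same split, sort, join sequence can be sketched in Python. The function name `sort_prefix` and its `n` parameter are made up for this illustration and don't come from the answer above.

```python
def sort_prefix(line, n=4):
    """Return the first n characters of line, sorted by code point."""
    chars = list(line[:n])   # split the prefix into single characters
    chars.sort()             # plain string comparison order
    return "".join(chars)    # join the result on an empty string


for word in ["apple", "banana", "uoiea"]:
    print(sort_prefix(word))
```

For those three words this prints alpp, aabn and eiou, one per line.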

Upvotes: 3

Charles Stewart

Reputation: 11837

Try defining two helper functions:

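explodeword () {
        # Print each character of $1 on its own line.
        test -z "$1" && return
        printf '%s\n' "${1:0:1}"
        explodeword "${1:1}"
}

sortword () {
        # Sort the characters, then strip the newlines to rejoin them.
        explodeword "$1" | sort | tr -d '\n'
        echo
}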

Then

cut -c1-4 simple_file.txt | while IFS= read -r word; do sortword "$word"; done

will do what you want.

Upvotes: 2

PM 2Ring

Reputation: 55469

The sort command sorts files line by line; it's not designed to sort the contents of a line. It's not impossible to make sort do what you want, but it would be a bit messy and probably inefficient.

I'd probably do this in Python, but since you might not have Python, here's a short awk command that does what you want. Note that it relies on asort(), which is a GNU awk (gawk) extension.

awk '{split(substr($0,1,4),a,"");n=asort(a);s="";for(i=1;i<=n;i++)s=s a[i];print s}' 

Just put the name of the file (or files) that you want to process at the end of the command line.

Here's some data I used to test the command:

data

this
is a
simple
test file

a
of
apple
banana
cat
uoiea
bye

And here's the output:

hist
 ais
imps
estt

a
fo
alpp
aabn
act
eiou
bey

Here's an ugly Python one-liner; it would look a bit nicer as a proper script rather than as a Bash command line:

python -c "import sys;print('\n'.join([''.join(sorted(s[:4])) for s in open(sys.argv[1]).read().splitlines()]))"

In contrast to the awk version, this command can only process a single file, and it reads the whole file into RAM to process it, rather than processing it line by line.
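As a rough sketch of what a line-by-line version might look like, the generator below handles one line at a time instead of reading everything into RAM. The name `sorted_prefixes` and the `width` parameter are assumptions made for this example, and the `io.StringIO` object is only an in-memory stand-in for a real file.

```python
import io


def sorted_prefixes(lines, width=4):
    """Yield the sorted first `width` characters of each line, one at a time."""
    for line in lines:
        # Only the current line is held in memory.
        yield "".join(sorted(line.rstrip("\n")[:width]))


# In-memory stand-in for a file; any iterable of lines works the same way.
sample = io.StringIO("this\nis a\nsimple\n\nof\n")
for result in sorted_prefixes(sample):
    print(result)
```

Passing `fileinput.input()` from the standard library instead of `sample` should stream every file named on the command line, which would cover the multiple-file case as well.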

Upvotes: 1
