f. c.

Reputation: 1135

Sort filenames by the numbers after the last occurrence of a character

I want to sort filenames by the number that follows the last occurrence of s. How can I do it?

For example:

01002M00T1relaxouspios1001a001.nii
01002M00T1relaxos301a001.nii
01002M00T1relaxos201a001.nii
01002M00T1relaxouspios901a001.nii

The sorted filenames should be:

01002M00T1relaxos201a001.nii
01002M00T1relaxos301a001.nii
01002M00T1relaxouspios901a001.nii
01002M00T1relaxouspios1001a001.nii

I tried to use

$ sort -ts -nk2,4

But it only works for the first 2 filenames.

Example 2:

01002M00T1relaxos201a001.nii 
01002M00T1relaxouspios1001a001.nii 
01002M00T130relaxos301a001.nii 
01002M00T130relaxouspios901a001.ni

Expected Output:

01002M00T1relaxos201a001.nii 
01002M00T130relaxos301a001.nii 
01002M00T130relaxouspios901a001.ni
01002M00T1relaxouspios1001a001.nii 

Upvotes: 1

Views: 223

Answers (3)

F. Hauri  - Give Up GitHub

Reputation: 70752

There is an elegant bash-only solution if the number is unique and can serve as an index. The idea is to use a regular indexed bash array, not an associative one:

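unset sortedlist
declare -a sortedlist
# Use the number after the last "s" as the array index.
while IFS= read -r filename; do
    [[ $filename =~ s([0-9]+)[a-rt-z][^s]*$ ]] &&
        sortedlist[${BASH_REMATCH[1]}]=$filename
done < <(ls)
# Indexed arrays expand in ascending index order.
printf "%s\n" "${sortedlist[@]}"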
01002M00T1relaxos201a001.nii
01002M00T1relaxos301a001.nii
01002M00T1relaxouspios901a001.nii
01002M00T1relaxouspios1001a001.nii

Note: an indexed array expands in ascending numerical index order, so 901 comes before 1001.

Note 2: because [[ ... =~ ... ]] is a condition, filenames that do not match the regular expression are simply ignored.
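
The same idea can be wrapped in a function that reads names from stdin instead of parsing ls. This is only a sketch under assumptions: the name sort_by_last_s_number is hypothetical, the regular expression is slightly loosened from the one above (it accepts any non-digit, non-s character after the number), and it needs bash rather than plain sh. It also reports a clash when two names share a number, since an indexed array keeps only one value per index.

```shell
#!/usr/bin/env bash
# sort_by_last_s_number is a hypothetical helper name, not from the answer.
# It reads one name per line on stdin and prints the names ordered by the
# number that follows the last "s". Bash is required (arrays, [[ =~ ]]).
sort_by_last_s_number() {
    local -a sorted=()
    local name key
    # "s", digits, one char that is neither a digit nor "s", then no more "s".
    local re='s([0-9]+)[^0-9s][^s]*$'
    while IFS= read -r name; do
        [[ $name =~ $re ]] || continue       # skip names without a key
        key=$((10#${BASH_REMATCH[1]}))       # force base 10 so 08 is not octal
        if [[ -n ${sorted[key]+x} ]]; then
            # An indexed array holds one value per index; report the clash.
            printf 'duplicate key %s, skipping %s\n' "$key" "$name" >&2
            continue
        fi
        sorted[key]=$name
    done
    # Expansion walks the indices in ascending order.
    if (( ${#sorted[@]} )); then printf '%s\n' "${sorted[@]}"; fi
}
```

It could be called as `printf '%s\n' *.nii | sort_by_last_s_number`. Names that contain newlines are not handled.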

Upvotes: 2

Chris Seymour

Reputation: 85775

Simple: do a version sort with the -V option:

$ sort -V file

01002M00T1relaxos201a001.nii
01002M00T1relaxos301a001.nii
01002M00T1relaxouspios901a001.nii
01002M00T1relaxouspios1001a001.nii

Edit after 2nd example posted.

General Case:

$ ls | awk -Fs '{print $NF, $0}' | sort -n | awk '{print $2}'

01002M00T1relaxos201a001.nii
01002M00T130relaxos301a001.nii
01002M00T130relaxouspios901a001.ni
01002M00T1relaxouspios1001a001.nii
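
The general-case pipeline splits on whitespace in its last step, so a filename containing a space would be cut short. A possible variant, sketched here with a hypothetical function name (sort_by_suffix) and the assumption that no filename contains a tab or a newline, puts a tab between the key and the name and strips the key with cut:

```shell
# sort_by_suffix is a hypothetical helper name, not from the answer.
# Reads one name per line on stdin; prints the names ordered numerically by
# the text after the last "s". Assumes names contain no tab or newline.
sort_by_suffix() {
    # Decorate: "<text after last s><TAB><full name>"
    awk -Fs '{ printf "%s\t%s\n", $NF, $0 }' |
        # Sort numerically on the first tab-separated field only.
        sort -t "$(printf '\t')" -k1,1n |
        # Undecorate: drop the key, keep the rest of the line intact.
        cut -f2-
}
```

It could be used as `ls | sort_by_suffix`. A name without any s is keyed on the whole name, so it sorts by its leading digits, if it has any.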

Upvotes: 2

David W.

Reputation: 107040

You could use a programming language like Perl or Python, which gives you more control over what you sort on, but if you insist on Bash, you'll have to employ a little trick:

You can create a sort key using sed, sort on that key, and then remove the key again with sed:

ls | sed 's/\(.*s\)\(.*\)/\1\2 ^ \2/'

The above relies on the greediness of regular expressions. The \(.*s\) matches everything up to and including the final lowercase s. The \(.*\) matches everything after that s. \1 and \2 refer to the text captured by the two \(...\) groups. Each output line therefore holds two strings: the first is the name of the file, the second is the sort key. The output will look like this:

$ ls | sed 's/\(.*s\)\(.*\)/\1\2 ^ \2/'
01002M00T1relaxouspios1001a001.nii ^ 1001a001.nii
01002M00T1relaxos301a001.nii ^ 301a001.nii
01002M00T1relaxos201a001.nii ^ 201a001.nii
01002M00T1relaxouspios901a001.nii ^ 901a001.nii
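
To see the greedy match in isolation, the key can be extracted from a single name. This is a small sketch with hypothetical helper names (last_s_suffix, last_s_suffix_builtin). The second helper assumes the shell's own "##" prefix removal is acceptable in place of sed.

```shell
# last_s_suffix is a hypothetical helper name, not part of the answer.
# Prints whatever follows the last lowercase "s" in its argument.
last_s_suffix() {
    # Greedy .*s consumes everything up to and including the final "s".
    printf '%s\n' "$1" | sed 's/.*s//'
}

# The same key without an external process: "##*s" strips the longest
# prefix that ends in "s".
last_s_suffix_builtin() {
    printf '%s\n' "${1##*s}"
}

last_s_suffix 01002M00T1relaxouspios1001a001.nii          # prints 1001a001.nii
last_s_suffix_builtin 01002M00T1relaxouspios1001a001.nii  # prints 1001a001.nii
```

Both helpers return the name unchanged when it contains no s at all.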

Now I can sort numerically on the part after the ^:

$ ls | sed 's/\(.*s\)\(.*\)/\1\2 ^ \2/' | sort -t^ -k2,2n
01002M00T1relaxos201a001.nii ^ 201a001.nii
01002M00T1relaxos301a001.nii ^ 301a001.nii
01002M00T1relaxouspios901a001.nii ^ 901a001.nii
01002M00T1relaxouspios1001a001.nii ^ 1001a001.nii

Now all I have to do is remove that sort key:

$ ls | sed 's/\(.*s\)\(.*\)/\1\2 ^ \2/' | sort -t^ -k2,2n | sed 's/ ^ .*//'
01002M00T1relaxos201a001.nii
01002M00T1relaxos301a001.nii
01002M00T1relaxouspios901a001.nii
01002M00T1relaxouspios1001a001.nii

Upvotes: 2
