Richard H
Richard H

Reputation: 39105

Bash: sort text file by last field value

I have a text file containing ~300k rows. Each row has a varying number of comma-delimited fields, the last of which is guaranteed numerical. I want to sort the file by this last numerical field. I can't do:

sort -t, -n -k 2 file.in > file.out

as the number of fields in each row is not constant. I think sed, awk maybe the answer, but not sure how. E.g:

awk -F, '{print $NF}' file.in

gives me the last column value, but how to use this to sort the file?

Upvotes: 25

Views: 29907

Answers (6)

Sebastian Wagner
Sebastian Wagner

Reputation: 2526

Python one-liner:

python -c "print ''.join(sorted(open('filename'), key=lambda l: int(l.split(',')[-1])))"

Upvotes: 0

mmrtnt
mmrtnt

Reputation: 392

I'm going to throw mine in here as an alternative (and I couldn't get awk to work) :)

sample file:

Call of Doody                           1322
Seam the Ripper                         1329
Mafia Bots 1                            1109
Chicken Fingers                         1243
Batup Light                             1221
Hunter F Tomcat                         1140
Tober                                   0833

code:

for i in `sed -e 's/.* \(\d\)*/\1/' file.txt | sort`; do grep $i file.txt; done > file_sort.txt

Upvotes: 0

Fred Foo
Fred Foo

Reputation: 363767

Use awk to put the numeric key up front. $NF is the last field of the current record. Sort. Use sed to remove the duplicate key.

awk -F, '{ print $NF, $0 }' yourfile | sort -n -k1 | sed 's/^[0-9][0-9]* //'

Upvotes: 37

Benoit
Benoit

Reputation: 79225

Perl one-liner:

@lines=<STDIN>;foreach(sort{($a=~/.*,(\d+)/)[0]<=>($b=~/.*,(\d+)/)[0]}@lines){print;}

Upvotes: 0

Benoit
Benoit

Reputation: 79225

vim file.in -c '%sort n /.*,\zs/' -c 'saveas file.out' -c 'q'

Upvotes: 3

zwol
zwol

Reputation: 140788

Maybe reverse the fields of each line in the file before sorting? Something like

perl -ne 'chomp; print(join(",",reverse(split(","))),"\n")' |
  sort -t, -n -k1 |
  perl -ne 'chomp; print(join(",",reverse(split(","))),"\n")'

should do it, as long as commas are never quoted in any way. If this is a full-fledged CSV file (in which commas can be quoted with backslash or space) then you need a real CSV parser.

Upvotes: 0

Related Questions