Reputation: 42448
What is the simplest way to sort a list of lines, sorting on the last field of each line? Each line may have a variable number of fields.
Something like
sort -k -1
is what I want, but sort(1) does not take negative numbers to select fields from the end instead of the start.
I'd also like to be able to choose the field delimiter too.
Edit: To add some specificity to the question: The list I want to sort is a list of pathnames. The pathnames may be of arbitrary depth hence the variable number of fields. I want to sort on the filename component.
This additional information may change how one manipulates the line to extract the last field (basename(1) may be used), but does not change sorting requirements.
e.g.
/a/b/c/10-foo
/a/b/c/20-bar
/a/b/c/50-baz
/a/d/30-bob
/a/e/f/g/h/01-do-this-first
/a/e/f/g/h/99-local
I want this list sorted on the filenames, which all start with numbers indicating the order the files should be read.
I've added my answer below which is how I am currently doing it. I had hoped there was a simpler way - maybe a different sort utility - perhaps without needing to manipulate the data.
Upvotes: 54
Views: 40699
Reputation: 77
| sed "s#(.*)/#\1"\\$'\x7F'\# \
| sort -t\\$'\x7F' -k2,2 \
| sed s\#\\$'\x7F'"#/#"
Still way worse than simple negative field indexes for sort(1) but using the DEL character as delimiter shouldn’t cause any problem in this case.
I also like how symmetrical it is.
Upvotes: 0
Reputation: 14835
Here is a python oneliner version, note that it assumes the field is integer, you can change that as needed.
echo file.txt | python3 -c 'import sys; list(map(sys.stdout.write, sorted(sys.stdin, key=lambda x: int(x.rsplit(" ", 1)[-1]))))'
Upvotes: 0
Reputation: 11185
I want this list sorted on the filenames, which all start with numbers indicating the order the files should be read.
find . | sed 's#.*/##' | sort
the sed replaces all parts of the list of results that ends in slashes. the filenames are whats left, and you sort on that.
Upvotes: 1
Reputation: 381
awk '{print $NF,$0}' file | sort | cut -f2- -d' '
Basically, this command does:
Upvotes: 38
Reputation: 86718
Here's a Perl command line (note that your shell may require you to escape the $
s):
perl -e "print sort {(split '/', $a)[-1] <=> (split '/', $b)[-1]} <>"
Just pipe the list into it or, if the list is in a file, put the filename at the end of the command line.
Note that this script does not actually change the data, so you don't have to be careful about what delimeter you use.
Here's sample output:
>perl -e "print sort {(split '/', $a)[-1] <=> (split '/', $b)[-1]} " files.txt /a/e/f/g/h/01-do-this-first /a/b/c/10-foo /a/b/c/20-bar /a/d/30-bob /a/b/c/50-baz /a/e/f/g/h/99-local
Upvotes: 14
Reputation: 42448
Replace the last delimiter on the line with another delimiter that does not otherwise appear in the list, sort on the second field using that other delimiter as the sort(1) delimiter, and then revert the delimiter change.
delim=/
new_delim=" "
cat $list \
| sed "s|\(.*\)$delim|\1$new_delim|" \
| sort -t"$new_delim" -k 2,2 \
| sed "s|$new_delim|$delim|"
The problem is knowing what delimiter to use that does not appear in the list. You can make multiple passes over the list and then grep for a succession of potential delimiters, but it's all rather nasty - particularly when the concept of "sort on the last field of a line" is so simply expressed, yet the solution is not.
Edit: One safe delimiter to use for $new_delim is NUL since that cannot appear in filenames, but I don't know how to put a NUL character into a bourne/POSIX shell script (not bash) and whether sort and sed will properly handle it.
Upvotes: 2
Reputation: 104050
#!/usr/bin/ruby
f = ARGF.read
lines = f.lines
broken = lines.map {|l| l.split(/:/) }
sorted = broken.sort {|a, b|
a[-1] <=> b[-1]
}
fixed = sorted.map {|s| s.join(":") }
puts fixed
If all the answers involve perl or awk, might as well solve the whole thing in the scripting language. (Incidentally, I tried in Perl first and quickly remembered that I dislike Perl's lists-of-lists. I'd love to see a Perl guru's version.)
Upvotes: 0
Reputation: 342333
something like this
awk '{print $NF"|"$0}' file | sort -t"|" -k1 | awk -F"|" '{print $NF }'
Upvotes: 9
Reputation: 3253
I think the only solution would be to use awk
:
awk
.Upvotes: 2
Reputation: 1075
A one-liner in perl for reversing the order of the fields in a line:
perl -lne 'print join " ", reverse split / /'
You could use it once, pipe the output to sort, then pipe it back and you'd achieve what you want. You can change / /
to / +/
so it squeezes spaces. And you're of course free to use whatever regular expression you want to split the lines.
Upvotes: 3
Reputation: 29011
sort
allows you to specify the delimiter with the -t
option, if I remember it well. To compute the last field, you can do something like counting the number of delimiters in a line and sum one. For instance something like this (assuming the ":" delimiter):
d=`head -1 FILE | tr -cd : | wc -c`
d=`expr $d + 1`
($d
now contains the last field index).
Upvotes: -1