Reputation: 589
I need the Ruby idiom for sorting on two fields. In Python if you sort a list of two-element tuples, it sorts based on the first element, and if two elements are equal then the sort is based on the second element.
One example is the following sorting code in Python (word sort from longest to shortest and consider second element to break ties) from http://www.pythonlearn.com/html-008/cfbook011.html
txt = 'but soft what light in yonder window breaks'
words = txt.split()
t = list()
for word in words:
t.append((len(word), word))
t.sort(reverse=True)
res = list()
for length, word in t:
res.append(word)
print res
What I came up in Ruby is the following code that uses structs
txt = 'but soft what light in yonder window breaks'
words = txt.split()
t = []
tuple = Struct.new(:len, :word)
for word in words
tpl = tuple.new
tpl.len = word.length
tpl.word = word
t << tpl
end
t = t.sort {|a, b| a[:len] == b[:len] ?
b[:word] <=> a[:word] : b[:len] <=> a[:len]
}
res = []
for x in t
res << x.word
end
puts res
I would like to know if there are better ways (less code) to achieve this stable sort.
Upvotes: 2
Views: 1344
Reputation: 110685
Ruby makes this easy, using Enumerable#sort_by will and Array#<=> for sorting.
def sort_on_two(arr, &proc)
arr.map.sort_by { |e| [proc[e], e] }.reverse
end
txt = 'but soft what light in yonder window breaks'
sort_on_two(txt.split) { |e| e.size }
#=> ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]
sort_on_two(txt.split) { |e| e.count('aeiou') }
#=> ["yonder", "window", "breaks", "what", "soft", "light", "in", "but"]
sort_on_two(txt.split) { |e| [e.count('aeiou'), e.size] }
#=> ["yonder", "window", "breaks", "light", "what", "soft", "but", "in"]
Note that in recent versions of Ruby, proc.call(e)
can be written proc[e]
, proc.yield(e)
or proc.(e)
.
Upvotes: 1
Reputation: 106027
I think you're overthinking this.
txt = 'but soft what light in yonder window breaks'
lengths_words = txt.split.map {|word| [ word.size, word ] }
# => [ [ 3, "but" ], [ 4, "soft" ], [ 4, "what" ], [ 5, "light" ], ... ]
sorted = lengths_words.sort
# => [ [ 2, "in" ], [ 3, "but" ], [ 4, "soft" ], [ 4, "what" ], ... ]
If you really want to use Struct, you can:
tuple = Struct.new(:length, :word)
tuples = txt.split.map {|word| tuple.new(word.size, word) }
# => [ #<struct length=3, word="but">, #<struct length=4, word="soft">, ... ]
sorted = tuples.sort_by {|tuple| [ tuple.length, tuple.word ] }
# => [ #<struct length=2, word="in">, #<struct length=3, word="but">, ... ]
This is equivalent:
sorted = tuples.sort {|tuple, other| tuple.length == other.length ?
tuple.word <=> other.word : tuple.length <=> other.length }
(Note that it's sort
this time, not sort_by
.)
...but since we're using a Struct we can make this nicer by defining our own comparison operator (<=>
), which sort
will invoke (the same works in any Ruby class):
tuple = Struct.new(:length, :word) do
def <=>(other)
[ length, word ] <=> [ other.length, other.word ]
end
end
tuples = txt.split.map {|word| tuple.new(word.size, word) }
tuples.sort
# => [ #<struct length=2, word="in">, #<struct length=3, word="but">, ... ]
There are other options for more complex sorting. If you wanted to get longest words first, for example:
lengths_words = txt.split.map {|word| [ word.size, word ] }
sorted = lengths_words.sort_by {|length, word| [ -length, word ] }
# => [ [ 6, "breaks" ], [ 6, "window" ], [ 6, "yonder" ], [ 5, "light" ], ... ]
Or:
tuple = Struct.new(:length, :word) do
def <=>(other)
[ -length, word ] <=> [ -other.length, other.word ]
end
end
txt.split.map {|word| tuple.new(word.size, word) }.sort
# => [ #<struct length=6, word="breaks">, #<struct length=6, word="window">, #<struct length=6, word="yonder">, ... ]
As you can see, I'm relying a lot on Ruby's built-in ability to sort arrays based on their contents, but you can also "roll your own" if you prefer, which might perform better with many, many items. Here's a comparison method that's equivalent to your t.sort {|a, b| a[:len] == b[:len] ? ... }
code (plus a bonus to_s
method):
tuple = Struct.new(:length, :word) do
def <=>(other)
return word <=> other.word if length == other.length
length <=> other.length
end
def to_s
"#{word} (#{length})"
end
end
sorted = txt.split.map {|word| tuple.new(word.size, word) }.sort
puts sorted.join(", ")
# => in (2), but (3), soft (4), what (4), light (5), breaks (6), window (6), yonder (6)
Finally, a couple comments on your Ruby style:
You pretty much never see for
in idiomatic Ruby code. each
is the idiomatic way to do almost all iteration in Ruby, and "functional" methods like map
, reduce
and select
are also common. Never for
.
A great advantage of Struct is that you get accessor methods for each property, so you can do tuple.word
instead of tuple[:word]
.
Methods with no arguments are called without parentheses: txt.split.map
, not txt.split().map
Upvotes: 5
Reputation: 4577
UPDATE: my first answer was wrong (this time!), thanks to @mu is too short comment
Your code is ok to sort on two criteria, but if you just want to achieve the same result, the best is to do the following:
txt.split.sort_by{|a| [a.size,a] }.reverse
=> ["breaks", "window", "yonder", "light", "soft", "what", "but", "in"]
The first check will use the size operator, and if the result is zero, it will use the second one....
If you really want to keep your data structure, it's same principle:
t.sort_by{ |a| [a[:len],a[:word]] }.reverse
Upvotes: 0