sammms

Reputation: 677

How can I sort an array of strings based on a non-standard alphabet?

I'm trying to sort an array of phrases in Esperanto by alphabetical order. Is there a way to use sort_by to accomplish this?

I'm checking each character of the string against its index in the Esperanto alphabet, with each increasing index being a step lower in sorting priority:

  esp_alph = " abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
  arr.sort_by {|string|  
    [esp_alph.index(string[0]),
     esp_alph.index(string[1]),
     esp_alph.index(string[2]),
     esp_alph.index(string[3])]}

However, this isn't a scalable solution: it only compares the first four characters, and it breaks when a string is shorter than the number of positions I check, because string[3] is nil and index(nil) raises a TypeError. It seems like I'm right at the cusp of a loop based on my string length, but I can't figure out how to implement it without syntax errors. Or is there a better way to go about solving this issue?

Upvotes: 2

Views: 2181

Answers (3)

Cary Swoveland

Reputation: 110685

ESP_ALPH = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"

ESP_MAP  = ESP_ALPH.each_char.with_index.to_h
  #=> {"a"=> 0, "b"=> 1, "c"=> 2, "ĉ"=> 3, "d"=> 4, "e"=> 5, "f"=> 6,
  #    "g"=> 7, "ĝ"=> 8, "h"=> 9, "ĥ"=>10, "i"=>11, "j"=>12, "ĵ"=>13,
  #    "k"=>14, "l"=>15, "m"=>16, "n"=>17, "o"=>18, "p"=>19, "r"=>20,
  #    "s"=>21, "ŝ"=>22, "t"=>23, "u"=>24, "ŭ"=>25, "v"=>26, "z"=>27}

def sort_esp(str)
  str.each_char.sort_by { |c| ESP_MAP[c] }.join
end

str = ESP_ALPH.chars.shuffle.join
  #=> "hlbzŭvŝerĝoipjafntĵsmgĉdukĥc"

sort_esp(str) == ESP_ALPH
  #=> true

Upvotes: 0

steenslag

Reputation: 80065

esp_alph = " abcĉĉdefgĝĝhĥĥijĵĵklmnoprsŝŝtuŭŭvz"

arr = ["abc\u0302a", "abĉa","abca" ]
p arr.sort_by {|string| string.chars.map{|c| esp_alph.index(c)}}
# => ["abca", "abĉa", "abĉa"]

For better performance, esp_alph should probably be a Hash that maps each character to its position, since String#index scans the string from the start on every lookup.
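A minimal sketch of that Hash-based variant, for illustration only. The names ESP_ALPHABET, ESP_ORDER and esp_key are hypothetical and not from the answer above. It also assumes two things the answer does not state: that normalizing to NFC with String#unicode_normalize is an acceptable way to fold the combining-accent spellings into single characters, and that characters outside the alphabet should sort last.

```ruby
# Hypothetical names; a sketch, not the answerer's code.
# Normalizing the literal guards against an editor saving it in decomposed form.
ESP_ALPHABET = " abcĉdefgĝhĥijĵklmnoprsŝtuŭvz".unicode_normalize(:nfc)

# Character => position, built once so each lookup is a hash access.
ESP_ORDER = ESP_ALPHABET.each_char.with_index.to_h

# Sort key: one position per character. Arrays compare element by element,
# so strings of any length work.
def esp_key(string)
  string.unicode_normalize(:nfc).each_char.map do |c|
    # Assumption: characters not in the alphabet sort after all known ones.
    ESP_ORDER.fetch(c, ESP_ORDER.size)
  end
end

words  = ["ĉapo", "cent", "ĝis", "geto", "ce"]
sorted = words.sort_by { |w| esp_key(w) }
p sorted
# => ["ce", "cent", "ĉapo", "geto", "ĝis"]
```

A plain words.sort would instead place "geto" before "ĉapo", because it compares code points rather than alphabet positions.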

Upvotes: 1

sawa

Reputation: 168101

Simply replace all characters in the Esperanto alphabet with some characters in the ASCII table so that the Esperanto alphabet order matches the ASCII order.

Suppose you have the Esperanto alphabet in the order you gave, which I assume is the order it is supposed to be in:

esp_alph = " abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"

and take any contiguous portion of the ASCII character table of the same length (notice that \\ inside a double-quoted Ruby string is a single backslash character):

ascii = "@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\"

or, equivalently, using the range notation that tr understands:

ascii = "@-\\"

Then, you can simply do:

arr.sort_by{|string| string.tr(esp_alph, ascii)}

Here, tr is faster than gsub, and I think it scales well enough.
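As a small worked example of this approach (the sample phrases are made up for illustration, and the alphabet literal is assumed to use precomposed accented letters):

```ruby
esp_alph = " abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
ascii    = "@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\"

# Both strings must have the same number of characters,
# otherwise tr pads the replacement set with its last character.
raise "length mismatch" unless esp_alph.length == ascii.length

# Hypothetical sample data.
phrases = ["ŝi venas", "si venas", "ĉu vi", "cu", "ĝi"]

# Each phrase is translated to an ASCII key whose byte order
# matches the Esperanto alphabet order.
p "ŝi venas".tr(esp_alph, ascii)
# => "WL@[FRAV"

sorted = phrases.sort_by { |phrase| phrase.tr(esp_alph, ascii) }
p sorted
# => ["cu", "ĉu vi", "ĝi", "si venas", "ŝi venas"]
```

Characters that are not in esp_alph pass through tr unchanged, so uppercase letters or punctuation in the input could collide with the replacement characters; this sketch assumes the input only uses characters from the alphabet string.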

Upvotes: 2
