Daniel Abrahamsson

Reputation: 1965

Ruby - internationalized domain names

I need to support internationalized domain names in an app I am writing. More specifically, I need to ACE encode domain names before I pass them on to an external API.

The best way to do this seems to be by using libidn. However, I have problems installing it on my development machine (Windows 7, Ruby 1.8.6): the installation complains that it cannot find the GNU IDN library, even though I have installed it and provided the full path to it.

So basically I am considering two things:

  1. Search the web for a prebuilt win32 libidn gem (fruitless so far)

  2. Find another (hopefully pure) Ruby library that can do the same thing (apparently not found yet, since I am asking this question here)

So has anyone here got libidn to work under Windows? Or have you used some other library or code snippet that is able to encode domain names?

Upvotes: 3

Views: 613

Answers (1)

Daniel Abrahamsson

Reputation: 1965

Thanks to this snippet, I finally found a solution that does not require libidn. It is built upon punycode4r together with either the unicode gem (a prebuilt binary can be found here) or ActiveSupport. I will use ActiveSupport since I use Rails anyway, but for reference I include both methods.

With the unicode gem:

require 'unicode'
require 'punycode' #This is not a gem, but a standalone file.

def idn_encode(domain)
    parts = domain.split(".").map do |label|
        encoded = Punycode.encode(Unicode::normalize_KC(Unicode::downcase(label)))
        if encoded =~ /-$/ #Pure ASCII
            encoded.chop! #Remove trailing '-'
        else #Contains non-ASCII characters
            "xn--" + encoded
        end
    end
    parts.join(".")
end

With ActiveSupport:

require "punycode"
require "active_support"
$KCODE = "UTF-8" #Have to set this to enable mb_chars

def idn_encode(domain)
    parts = domain.split(".").map do |label|
        encoded = Punycode.encode(label.mb_chars.downcase.normalize(:kc))
        if encoded =~ /-$/ #Pure ASCII
            encoded.chop! #Remove trailing '-'
        else #Contains non-ASCII characters
            "xn--" + encoded
        end
    end
    parts.join(".")
end

The ActiveSupport solution was found thanks to this StackOverflow question.
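
For anyone on a newer Ruby, the same idea can probably be sketched without any gems or external files. The following is only a rough illustration, not the punycode4r code: it assumes Ruby 2.4 or later (for `String#unicode_normalize`, Unicode-aware `downcase` and `Integer#clamp`), which goes beyond the 1.8.6 setup in the question. The module name `PunycodeSketch` and its methods are made up for this example. The Punycode step follows the encoding procedure in RFC 3492, and instead of checking for a trailing '-' it simply skips labels that are already ASCII. It does none of the other IDNA checks (prohibited characters, label length).

```ruby
# Hypothetical, gem-free sketch; assumes Ruby >= 2.4.
module PunycodeSketch
  # Punycode parameters as given in RFC 3492, section 5.
  BASE = 36
  TMIN = 1
  TMAX = 26
  SKEW = 38
  DAMP = 700
  INITIAL_BIAS = 72
  INITIAL_N = 128
  DIGITS = [*"a".."z", *"0".."9"].freeze # digit values 0..35

  module_function

  # Bias adaptation function from RFC 3492, section 6.1.
  def adapt(delta, num_points, first_time)
    delta = first_time ? delta / DAMP : delta / 2
    delta += delta / num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) / 2
      delta /= BASE - TMIN
      k += BASE
    end
    k + (BASE - TMIN + 1) * delta / (delta + SKEW)
  end

  # Encodes one label. Pure ASCII input comes back with a trailing "-",
  # which is the behaviour the answer's code relies on.
  def encode(label)
    points = label.codepoints
    basic = points.select { |c| c < INITIAL_N }
    output = basic.pack("U*")
    output << "-" unless basic.empty?

    n = INITIAL_N
    delta = 0
    bias = INITIAL_BIAS
    handled = basic.size

    while handled < points.size
      # Smallest code point that has not been handled yet.
      m = points.select { |c| c >= n }.min
      delta += (m - n) * (handled + 1)
      n = m
      points.each do |c|
        delta += 1 if c < n
        next unless c == n
        # Emit delta as a variable-length integer.
        q = delta
        k = BASE
        loop do
          t = (k - bias).clamp(TMIN, TMAX)
          break if q < t
          output << DIGITS[t + (q - t) % (BASE - t)]
          q = (q - t) / (BASE - t)
          k += BASE
        end
        output << DIGITS[q]
        bias = adapt(delta, handled + 1, handled == basic.size)
        delta = 0
        handled += 1
      end
      delta += 1
      n += 1
    end
    output
  end

  # Downcase and NFKC-normalize each label, then ACE-encode the
  # labels that still contain non-ASCII characters.
  def idn_encode(domain)
    domain.split(".").map do |label|
      normalized = label.downcase.unicode_normalize(:nfkc)
      normalized.ascii_only? ? normalized : "xn--" + encode(normalized)
    end.join(".")
  end
end
```

With this sketch, `PunycodeSketch.idn_encode("bücher.example")` returns `"xn--bcher-kva.example"`, and a pure ASCII domain such as `"Example.COM"` is only downcased.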

Upvotes: 3
