Anthony

Reputation: 979

Ruby: Normalize URL by parsing

My goal is to convert any of these:

http://www.google.com
www.google.com
http://google.com
google.com

to:

www.google.com

I did the following:

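require 'uri'

def parse_url(url)
  uri = URI.parse(url)

  # A URL with a scheme (e.g. "http://...") has a parsed host.
  return uri.host if uri.scheme
  # No scheme: keep the string as is if it already starts with "www."
  return uri.to_s if uri.to_s[0, 4] == 'www.'

  "www.#{url}"
end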

But I feel there might be something more standard I can use, as this doesn't look very elegant.

Upvotes: 0

Views: 352

Answers (1)

the Tin Man

Reputation: 160631

Always make sure the code does what you want first. If there's a code smell later, think about different ways to do what you want and pick the one that's most easily understood.

Terse code isn't necessarily the most elegant. Often it's just hard to understand, which can lead to bugs being introduced later when someone else has to figure out what's wrong and can't follow the brilliance that led to the undecipherable logic.

Your code doesn't do what you want:

parse_url('http://google.com') # => "google.com"
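
To see why, it helps to look at what URI.parse extracts for each form of input. This is only a small sketch; the helper name uri_parts is made up for illustration and isn't part of the question's code:

```ruby
require 'uri'

# Hypothetical helper (an assumption, not from the question):
# returns the pieces URI.parse extracts from a string.
def uri_parts(url)
  uri = URI.parse(url)
  { scheme: uri.scheme, host: uri.host, path: uri.path }
end

uri_parts('http://google.com') # scheme "http", host "google.com", path ""
uri_parts('google.com')        # scheme nil, host nil, path "google.com"
```

With a scheme, the host comes back without any "www." prefix, and the original code returns it untouched. Without a scheme, there's no host at all and the whole string ends up in the path.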

Here's what I'd write as a quick and dirty first pass:

require 'uri'

def parse_url(url)
  uri = URI.parse(url)

  # if it's not a generic URI...
  if uri.scheme
    # peek at the host
    url_host = uri.host

    # if it starts with "www." then return it as is...
    if url_host[0, 4] == 'www.'
      return url_host
    # else add the prefix and return it
    else
      return 'www.' + url_host
    end

  # if it's a generic...
  else
    if url[0, 4] == 'www.'
      return url
    else
      return 'www.' + url
    end
  end
end

parse_url('http://www.google.com') # => "www.google.com"
parse_url('www.google.com') # => "www.google.com"
parse_url('http://google.com') # => "www.google.com"
parse_url('google.com') # => "www.google.com"

I'm sure I could come up with tighter code, but my concern would be a peer, or my future self, debugging in the early hours of the morning, trying to figure out what's wrong. To be kind to that person I'd rather keep it simple.
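
For comparison, a tighter version might look something like the sketch below. The name normalize_host is made up here, and it assumes the same four input shapes as above (anything else, such as a scheme with no host, isn't handled). Whether it's actually easier to debug at 3 a.m. is a judgment call:

```ruby
require 'uri'

# Hypothetical tighter variant; the name normalize_host is an assumption.
def normalize_host(url)
  uri = URI.parse(url)

  # With a scheme, use the parsed host; without one, treat the input as the host.
  host = uri.scheme ? uri.host : url

  # Add the "www." prefix only when it's missing.
  host.start_with?('www.') ? host : "www.#{host}"
end

normalize_host('http://www.google.com') # => "www.google.com"
normalize_host('www.google.com')        # => "www.google.com"
normalize_host('http://google.com')     # => "www.google.com"
normalize_host('google.com')            # => "www.google.com"
```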

Upvotes: 1
