Supersonic

Reputation: 430

Character encoding conversion

I have a string which contains Swedish characters and want to convert it to basic English.

name = "LänödmåtnÖng ÅjädårbÄn" 

These characters should be converted as follows:

Å, Ä => A
å, ä => a
Ö => O
ö => o

Is there a simple way to do it? If I try:

ascii_to_string = name.unpack("U*").map{|s|s.chr}.join

It returns L\xE4n\xF6dm\xE5tn\xD6ng \xC5j\xE4d\xE5rb\xC4n, a byte string with the Swedish characters shown as escapes, but I want them replaced with plain English letters.

Upvotes: 0

Views: 97

Answers (3)

steenslag

Reputation: 80065

Using OP's conversion table as input for the tr method:

#encoding: utf-8
name = "LänödmåtnÖng ÅjädårbÄn" 
p name.tr("ÅåÄäÖö", "AaAaOo") #=> "LanodmatnOng AjadarbAn"
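If the table grows, or a replacement ever needs more than one character, the same mapping can be kept in a Hash and applied with gsub. This is only a sketch: the constant SWEDISH_TO_ASCII and the method to_basic_latin are made-up names, and the mapping is the same one passed to tr above.

```ruby
# Hypothetical names; the mapping mirrors the tr call above.
SWEDISH_TO_ASCII = {
  "Å" => "A", "å" => "a",
  "Ä" => "A", "ä" => "a",
  "Ö" => "O", "ö" => "o"
}.freeze

def to_basic_latin(text)
  # Match any key of the table and look up its replacement in the hash.
  text.gsub(Regexp.union(SWEDISH_TO_ASCII.keys), SWEDISH_TO_ASCII)
end

name = "LänödmåtnÖng ÅjädårbÄn"
puts to_basic_latin(name) # prints "LanodmatnOng AjadarbAn"
```

For a fixed one-to-one table like this one, tr stays the shorter option.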

Upvotes: 3

cthulhu

Reputation: 3726

You already got a decent answer; however, if ActiveSupport is available there is a way that is easier to remember (no magical regular expressions):

name.parameterize

It replaces whitespace with dashes and lowercases the result, so you need to handle that somehow, for example by processing each word separately:

name.split.map { |s| s.parameterize }.join ' '

Upvotes: 1

Ivaylo Strandjev

Reputation: 70931

Try this (it relies on ActiveSupport's mb_chars and also lowercases the result):

string.mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n,'').downcase.to_s

As found in this post.
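Without ActiveSupport, roughly the same idea can be sketched with String#unicode_normalize, which is part of Ruby itself from 2.2 onwards. The method name strip_to_ascii is made up for this example, and unlike the one-liner above it does not call downcase, so the original case is kept.

```ruby
# Hypothetical helper; assumes Ruby 2.2+ and a UTF-8 source file.
def strip_to_ascii(text)
  # NFKD splits "ä" into "a" followed by a combining diaeresis (U+0308).
  decomposed = text.unicode_normalize(:nfkd)
  # Drop everything outside the ASCII range, including the combining marks.
  decomposed.gsub(/[^\x00-\x7F]/, "")
end

name = "LänödmåtnÖng ÅjädårbÄn"
puts strip_to_ascii(name) # prints "LanodmatnOng AjadarbAn"
```

Note that characters with no decomposition into an ASCII base letter (for example "ø") are removed entirely by this approach rather than replaced.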

Upvotes: 1
