Reputation: 1917
I would like to capitalize each word of a UTF-8 string. However, I need the function to ignore certain special characters at the beginning of words, such as "(-.,". The function will be used to capitalize song titles, which can look like this:
marko, gabriel boni, simple jack - recall (original mix)
...would output:
Marko, Gabriel Boni, Simple Jack - Recall (Original Mix)
It should also be able to capitalize UTF-8 characters, e.g. "å" > "Å" and "é" > "É".
Upvotes: 4
Views: 1861
Reputation: 7719
Is there a reason the Unicode::capitalize method from the unicode library does not suit your needs?
irb(main):013:0> require 'unicode'
=> true
irb(main):014:0> begin Unicode::capitalize 'åäöéèí' rescue $stderr.print "unicode error\n" end
=> "Åäöéèí"
irb(main):015:0> begin Unicode::capitalize '-åäöéèí' rescue $stderr.print "unicode error\n" end
=> "-åäöéèí"
Upvotes: 8
Reputation: 83680
"åbc".mb_chars.capitalize
#=> "Åbc"
"ébc".mb_chars.capitalize.to_s
#=> "Ébc"
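Update: to ignore leading non-word characters:
string = "-åbc"
# [^[:alnum:]] is Unicode-aware; \W is ASCII-only in Ruby 1.9+ and would also swallow "å"
str = string.match(/^([^[:alnum:]]*)(.*)/)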
str[1] + str[2].mb_chars.capitalize.to_s
#=> "-Åbc"
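The single-word approach above can be extended to a whole song title. The following is only a minimal sketch, assuming Ruby 2.4 or later (where the built-in String#capitalize is Unicode-aware, so ActiveSupport is not required). The helper name capitalize_title is hypothetical and not part of any library.

```ruby
# Hypothetical helper; assumes Ruby 2.4+ (Unicode-aware String#capitalize).
def capitalize_title(title)
  # Each match starts with a Unicode letter and may continue with letters,
  # combining marks, or apostrophes. Leading characters such as "(", "-",
  # "." or "," are never part of a match, so they are left untouched.
  title.gsub(/\p{L}[\p{L}\p{M}']*/) { |word| word.capitalize }
end

puts capitalize_title("marko, gabriel boni, simple jack - recall (original mix)")
puts capitalize_title("(åbc) -ébc")
```

Note that capitalize also downcases the rest of each word, so an all-caps token such as "DJ" would become "Dj"; whether that is acceptable depends on the titles being processed.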
Upvotes: 4
Reputation: 42865
I did something similar when I needed to filter a lot of characters.
I created a constants file, initializers/constants.rb:
# to_a is needed: collect without a block returns an Enumerator on Ruby 1.9+
letters = ("a".."z").to_a
numbers = ("1".."9").to_a
symbols = %w[! @ # $ % ^ & * ( ) _ - + = | \] { } : ; ' " ? / > . < , ]
FILTER = letters + numbers + symbols
Then I just checked whether a character was in my filter:
if !FILTER.include?(c)
  # no: c is not in the filter
else
  # yes: c is in the filter
end
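As an illustration of applying such a check to a whole string, here is a minimal sketch. The ALLOWED constant and the disallowed_chars helper are hypothetical names, and the allow-list is deliberately shortened; a Set is used only because membership tests on it are cheaper than on an Array.

```ruby
require 'set'

# Hypothetical, shortened allow-list built the same way as FILTER above.
ALLOWED = (("a".."z").to_a + ("1".."9").to_a + %w[! - ( ) . ,]).to_set

# Returns the characters of +text+ that are not in the allow-list.
def disallowed_chars(text)
  text.each_char.reject { |c| ALLOWED.include?(c) }
end

p disallowed_chars("abc-å!")
```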
You can also check a character's Unicode code point, but then you need to know the relevant range or specific values. I did this with Chinese characters, which is where my values come from. Here is some code to give you an idea:
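def check(char)
  # Convert the first character to its Unicode code point
  char = char.unpack('U*').first
  # CJK Unified Ideographs
  if char >= 0x4E00 && char <= 0x9FFF
    return true
  end
  # CJK Unified Ideographs Extension A
  if char >= 0x3400 && char <= 0x4DBF
    return true
  end
  # CJK Unified Ideographs Extension B
  if char >= 0x20000 && char <= 0x2A6DF
    return true
  end
  # CJK Unified Ideographs Extension C
  if char >= 0x2A700 && char <= 0x2B73F
    return true
  end
  return false
end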
You need to know the specific values here of course.
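If looking up the ranges by hand is not practical, one alternative is a Unicode script property in a regular expression. This is a different technique from the range check above, shown only as a sketch: it assumes Ruby 2.4 or later for String#match?, the helper name han_char? is hypothetical, and the Han script covers a somewhat broader set of code points than the four ranges listed, so verify it against your own data.

```ruby
# Hypothetical helper; assumes Ruby 2.4+ for String#match?.
def han_char?(char)
  # \p{Han} matches any single code point whose Unicode script is Han.
  char.match?(/\A\p{Han}\z/)
end

p han_char?("中")          # a CJK Unified Ideograph
p han_char?("\u{20000}")   # first code point of Extension B
p han_char?("å")           # a Latin letter
```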
Upvotes: 1