Reputation: 249
I have a display name field that I need to validate with a Ruby regex. It must accept letters from any language (French, Arabic, Chinese, German, Spanish, and so on) in addition to English, while rejecting special characters such as *()!@#$%^&. I am stuck on how to match the non-Latin characters.
Upvotes: 3
Views: 3938
Reputation: 571
In Ruby 1.9.1 and later (maybe earlier) one can use \p{L} to match letters in all languages (without the oniguruma gem described in another answer).
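As a rough illustration of \p{L}, the sketch below lists the characters in a name that are not Unicode letters. The helper name `non_letters` is made up for this example, and a UTF-8 source encoding (the default since Ruby 2.0) is assumed.

```ruby
# Hypothetical helper: returns every character in +name+ that is not a
# Unicode letter. An empty array means the name is made of letters only.
def non_letters(name)
  # [^\p{L}] is a negated class: anything outside the Letter category.
  name.scan(/[^\p{L}]/)
end

non_letters("Jos\u00e9")  # Latin letters, including an accented one
non_letters("李小龍")      # Han characters count as letters
non_letters("a*b!")       #=> ["*", "!"]
```

Note that spaces and digits are not letters either, so they would be reported as well.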
Upvotes: 1
Reputation: 14973
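Starting from Ruby 1.9, the String and Regexp classes are encoding-aware. Note that \w on its own still matches only ASCII word characters ([a-zA-Z0-9_]); for a Unicode-aware word character use \p{Word} (or the POSIX bracket [[:word:]]):
"可口可樂!?!".gsub(/\p{Word}/, 'Ha')
#=> "HaHaHaHa!?!"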
Upvotes: 1
Reputation: 85458
There are two possibilities:
Create a regex with a negated character class containing every symbol you don't want to match:
if name =~ /\A[^*!@%\^]+\z/ # add every forbidden symbol to the class; if this matches you are good
This solution may not be feasible, since there is a massive amount of symbols you'd have to insert, even if you were just to include the most common ones.
Use Oniguruma (see also: Oniguruma for Ruby main), which is the built-in regex engine from Ruby 1.9 onward. It supports Unicode character properties, so a name made up only of letters and combining marks can be matched using:
if name =~ /\A[\p{L}\p{M}]+\z/
You can see what these are all about here: Unicode Regular Expressions
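Putting the second approach together, a display-name check might look like the sketch below. The rule it encodes (words of letters and combining marks, separated by single spaces) is an assumption for illustration, as are the names `DISPLAY_NAME` and `valid_display_name?`; `Regexp#match?` needs Ruby 2.4 or later.

```ruby
# Assumption: a display name is one or more words made of Unicode letters
# (\p{L}) and combining marks (\p{M}), separated by single spaces.
# \A and \z anchor the pattern to the whole string.
DISPLAY_NAME = /\A[\p{L}\p{M}]+(?: [\p{L}\p{M}]+)*\z/

# Hypothetical helper, not part of any library.
def valid_display_name?(name)
  DISPLAY_NAME.match?(name)
end

valid_display_name?("Jean Dupont")  #=> true
valid_display_name?("e\u0301")      #=> true  (e plus a combining acute accent)
valid_display_name?("user@name")    #=> false
```

Including \p{M} matters for names typed in decomposed form, where an accent is a separate combining character rather than part of the letter.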
Upvotes: 3