Reputation: 2227
I want to remove non-alphanumeric characters from a string, but keep international characters such as accented letters. I also want to keep whitespace. Here is what I have so far:
the_string = the_string.gsub(/[^a-z0-9 -]/i, '')
However, this also removes accented letters.
Solution that I used:
the_string = the_string.gsub(/[^\p{Alnum}\p{Space}-]/u, '')
It works! Thanks.
Upvotes: 9
Views: 4367
Reputation: 79723
You can use character properties to do this:
the_string.gsub(/[^\p{Alnum} -]/, '')
You may also want to use \p{Space} to keep other whitespace, such as tabs, newlines, and non-breaking spaces:
the_string.gsub(/[^\p{Alnum}\p{Space}-]/, '')
(This also keeps the - character, which you have in your regexp.)
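To make the difference concrete, here is a minimal sketch comparing the ASCII-only pattern from the question with the property-based one. The helper names strip_ascii_only and strip_symbols and the sample string are made up for illustration, and it assumes the strings are UTF-8 with precomposed accented letters:

```ruby
# encoding: utf-8

# Hypothetical helper names, chosen only for this illustration.

# Pattern from the question: keeps only a-z, 0-9, space and hyphen.
def strip_ascii_only(text)
  text.gsub(/[^a-z0-9 -]/i, '')
end

# Property-based pattern: keeps Unicode letters and digits (\p{Alnum}),
# Unicode whitespace (\p{Space}) and the hyphen.
def strip_symbols(text)
  text.gsub(/[^\p{Alnum}\p{Space}-]/, '')
end

sample = "Café déjà-vu, n°5!"

puts strip_ascii_only(sample)  # Caf dj-vu n5
puts strip_symbols(sample)     # Café déjà-vu n5
```

Punctuation and symbols are dropped in both cases, but only the second version keeps the accented letters. It also keeps tabs and newlines, which the first version removes.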
Upvotes: 12