Kevin K

Reputation: 2227

Removing non-alphanumeric characters without removing international characters in ruby

I want to remove non-alphanumeric characters from a string without removing international characters such as accented letters. I also want to keep whitespace. Here is what I have so far:

the_string = the_string.gsub(/[^a-z0-9 -]/i, '')

However, this also removes accented and other international letters.
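A minimal sketch of the problem, assuming UTF-8 strings. The helper name `strip_ascii_only` and the sample text are illustrative only; the pattern is the one shown above. Because the class lists only `a-z` and `0-9`, any letter outside ASCII counts as "not allowed" and is deleted:

```ruby
# Hypothetical helper wrapping the ASCII-only pattern from the question.
def strip_ascii_only(text)
  # Anything that is not an ASCII letter, ASCII digit, space, or hyphen is removed.
  text.gsub(/[^a-z0-9 -]/i, '')
end

puts strip_ascii_only("Zürich café-bar #1")  # prints "Zrich caf-bar 1"
```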

Solution that I used:

the_string = the_string.gsub(/[^\p{Alnum}\p{Space}-]/u, '')

It works! Thanks.

Upvotes: 9

Views: 4367

Answers (1)

matt

Reputation: 79723

You can use character properties to do this:

the_string.gsub(/[^\p{Alnum} -]/, '')

You may also want to use \p{Space} instead of a literal space, so that other whitespace such as tabs, newlines and non-breaking spaces is kept as well:

the_string.gsub(/[^\p{Alnum}\p{Space}-]/, '')

(This also keeps the - character, which you have in your regexp.)
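As a rough illustration of how the two patterns differ, here is a short sketch assuming UTF-8 strings. The helper names and the sample text are made up for the example; only the regular expressions come from the answer above.

```ruby
# Hypothetical helpers wrapping the two patterns from the answer.

# Keeps Unicode letters and digits, the plain space character, and hyphens.
def keep_alnum_and_space(text)
  text.gsub(/[^\p{Alnum} -]/, '')
end

# Keeps Unicode letters and digits, any whitespace (tabs, newlines, ...), and hyphens.
def keep_alnum_and_whitespace(text)
  text.gsub(/[^\p{Alnum}\p{Space}-]/, '')
end

sample = "Crème brûlée:\tn° 5, s'il-vous-plaît!"

puts keep_alnum_and_space(sample)       # the tab is removed along with the punctuation
puts keep_alnum_and_whitespace(sample)  # the tab is kept, punctuation is still removed
```

In both cases the accented letters survive, because \p{Alnum} matches letters and digits from any script rather than just a-z and 0-9.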

Upvotes: 12
