bigpotato

Reputation: 27507

Ruby/Rails: How to remove all the Unicode in a string?

I have a lot of records in a legacy database and I need to export that data to a CSV in ISO-8859-1 encoding because there are Spanish characters.

However, when converting the data from UTF-8 to ISO-8859-1, it keeps raising errors about certain characters that cannot be converted. For example:

Encoding::UndefinedConversionError - U+2026 from UTF-8 to ISO-8859-1

This happens at this line in my controller:

send_data(data.encode("iso-8859-1"), filename: "books_data_#{date}.csv", type: 'text/csv; charset=iso-8859-1; header=present')

To fix this individual case I just did string.gsub!("…", ""). Is there a more universal way to remove every character in a Ruby string that cannot be represented in ISO-8859-1? Doing it by hand for each one that appears is incomplete, ugly, and hard to maintain if new characters turn up.

Upvotes: 0

Views: 1789

Answers (1)

utdemir

Reputation: 27216

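Are you looking for String#encode? With undef: :replace and replace: "", any character that has no equivalent in the target encoding is dropped instead of raising an error: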

irb(main):011:0> "Здравствуйте Stack Overflow!".encode("iso-8859-1", undef: :replace, replace: "")
=> " Stack Overflow!"
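
Applied to the export in the question, the same options could be wrapped in a small helper. This is only a sketch: the method name to_latin1 and the sample string are made up for illustration, and invalid: :replace is an extra assumption that also drops malformed byte sequences in the source data.

```ruby
# Hypothetical helper (name not from the original post).
# Converts UTF-8 text to ISO-8859-1, dropping anything that cannot be converted.
def to_latin1(text, replacement: "")
  # undef: :replace   -> characters with no ISO-8859-1 equivalent (e.g. U+2026)
  # invalid: :replace -> malformed byte sequences in the source string
  text.encode("ISO-8859-1", invalid: :replace, undef: :replace, replace: replacement)
end

# Made-up sample: the Spanish characters survive, the ellipsis is removed.
latin1 = to_latin1("Señor… año")
puts latin1.encoding
puts latin1.encode("UTF-8")
```

In the controller, the send_data call would then receive to_latin1(data) rather than data.encode("iso-8859-1"). Passing replacement: "?" instead of the empty string leaves a visible marker where a character was dropped.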

Upvotes: 2
