Reputation: 9978
I have a Rails application that allows users to import information from various sources, such as RSS feeds. The default encoding on my database is UTF-8, and I've been getting a lot of exceptions caused by non-UTF-8 data that comes through the system and crashes once it hits the database.
I'm able to detect the non-UTF-8 data using the is_utf8? method on the attributes before a save is done, but I haven't come up with a way to handle it. I've seen that iconv can convert, but it appears to require knowing which encoding I'm converting from.
Is there a simple way to do a best-guess conversion, or possibly just strip out the non-UTF-8 characters, and then save to the database?
Thanks!
Upvotes: 2
Views: 1294
Reputation: 536359
How is non-UTF-8 data making it into the system? Make sure all your pages are served as Content-Type text/html;charset=utf-8 and browsers will always submit UTF-8 data to your forms.
(Of course that still leaves things like mail and uploaded files, but those specific contexts often give you a declared encoding to go on.)
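For input that doesn't come through your own forms, here is a minimal sketch of the two fallbacks discussed here: transcode from the declared encoding when the source provides one, and otherwise strip the bytes that aren't valid UTF-8. The helper name `to_valid_utf8` and its `declared_encoding` parameter are hypothetical, not part of Rails, and the sketch assumes Ruby 2.1 or later for `String#scrub`.

```ruby
# Hypothetical helper (illustrative name, not a Rails API).
# raw               - the incoming string or bytes
# declared_encoding - encoding name reported by the source (e.g. from a
#                     feed's XML declaration or an HTTP header), or nil
def to_valid_utf8(raw, declared_encoding = nil)
  text = raw.dup # work on a copy so the caller's string is not mutated

  if declared_encoding
    begin
      # Reinterpret the bytes as the declared encoding, then transcode,
      # dropping anything that cannot be represented in UTF-8.
      converted = text.force_encoding(declared_encoding)
                      .encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: "")
      return converted.scrub("")
    rescue ArgumentError, EncodingError
      # Unknown encoding name or no converter available: fall through.
    end
  end

  # No usable declared encoding: treat the bytes as UTF-8 and strip
  # whatever is not a valid UTF-8 sequence.
  text.force_encoding(Encoding::UTF_8).scrub("")
end
```

You could call something like this on each imported attribute before saving, for example in a `before_validation` callback. Note that stripping loses data, so prefer the declared encoding whenever the source gives you one.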
Upvotes: 1