tuomur

Reputation: 7098

Force character set conversion

I have an application that writes data to a Microsoft SQL Server. The database's character set is CP1252, and the incoming data to be saved is in UTF-8. The data may contain characters that cannot be converted to CP1252, in which case the insert throws an exception.

The database guys said that I should just crunch the data to CP1252 forcibly, like this:

some_value = some_value.encode('CP1252', 'replace')

But SQLAlchemy does the conversion automatically, and I don't see a way to force a lossy conversion there.

import sqlalchemy

engine = sqlalchemy.create_engine(
    'mssql+pyodbc://...',
    encoding='CP1252',
    convert_unicode=True,
)

It is critical that the data is saved, even with some missing characters. How can I implement this? Note that I'm using a lot of database reflection in this case.

Upvotes: 0

Views: 310

Answers (1)

Esailija

Reputation: 140236

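I don't see a problem. Do the lossy encode yourself and decode the result back, so that SQLAlchemy receives a unicode string containing only characters that CP1252 can represent: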

some_value = some_value.encode('CP1252', 'replace').decode('CP1252')

If some_value isn't actually a unicode string but raw UTF-8 bytes, decode it first:

some_value = some_value.decode("utf-8").encode('cp1252', 'replace').decode('cp1252')
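
Both cases can be folded into one helper. This is only a minimal sketch, assuming Python 3 (where text is str and raw data is bytes); force_cp1252 is a hypothetical name, not part of SQLAlchemy or pyodbc. It would be applied to each string value before the value is handed to SQLAlchemy.

```python
def force_cp1252(value):
    """Return text where every character CP1252 cannot represent becomes '?'.

    Hypothetical helper for illustration; accepts str or UTF-8 encoded bytes.
    """
    if isinstance(value, bytes):
        # Raw UTF-8 input: turn it into text first.
        value = value.decode("utf-8")
    # Lossy encode, then decode back so the caller still gets a text string
    # that the driver can convert to CP1252 without raising.
    return value.encode("cp1252", "replace").decode("cp1252")


cleaned = force_cp1252("caf\u00e9 \u4e2d\u6587")  # 'café ??'
```

Characters that exist in CP1252, such as é or €, pass through unchanged; only the unrepresentable ones are replaced.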

Upvotes: 1
