Python 2.7 - Pandas UnicodeEncodeError with data from pyodbc

Question

I'm trying to pull data from SQL Server using pyodbc, load it into a pandas DataFrame, and export it to an HTML file, but I keep getting the following Unicode error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 15500: ordinal not in range(128)

Here is my current setup (encoding instructions per docs):

import pandas as pd
import pyodbc

cnxn = pyodbc.connect('DSN=Planning;UID=USER;PWD=PASSWORD;')
cnxn.setdecoding(pyodbc.SQL_CHAR, encoding='cp1252', to=unicode)
cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding='cp1252', to=unicode)
cnxn.setdecoding(pyodbc.SQL_WMETADATA, encoding='cp1252', to=unicode)
cnxn.setencoding(str, encoding='utf-8')
cnxn.setencoding(unicode, encoding='utf-8')
cursor = cnxn.cursor()

with open('Initial Dataset.sql') as f:
    initial_query = f.read()

cursor.execute(initial_query)
columns = [column[0] for column in cursor.description]
initial_data = cursor.fetchall()
i_df = pd.DataFrame.from_records(initial_data, columns=columns)
i_df.to_html('initial.html')

One odd but possibly useful detail: when I try to export a CSV:

i_df.to_csv('initial.csv')

I get the same error. However, when I pass an explicit encoding:

i_df.to_csv('initial.csv', encoding='utf-8')

It works. Can someone help me understand this encoding issue?
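
For reference, the same error can be reproduced without the database. This is only a minimal sketch, assuming the cause is the implicit ASCII encoding that Python 2 applies when unicode text is written to a byte-oriented file. The html string is a made-up stand-in for the output of i_df.to_html(), and the temporary path is purely illustrative:

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical stand-in for the unicode HTML that i_df.to_html() would produce;
# u'\u2019' is the right single quotation mark named in the traceback.
html = u"<td>It\u2019s</td>"

# Encoding to ASCII fails. Python 2 performs this step implicitly when
# unicode text is written to a file opened without an encoding.
try:
    html.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding to UTF-8 succeeds; the character becomes three bytes.
utf8_bytes = html.encode('utf-8')

# io.open accepts an explicit encoding on both Python 2 and Python 3,
# so the unicode text can be written without an implicit ASCII step.
path = os.path.join(tempfile.mkdtemp(), 'initial.html')
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(html)

# Read the file back to confirm the character survived the round trip.
with io.open(path, encoding='utf-8') as f:
    round_trip = f.read()
```

If that assumption holds, it would explain why to_csv() works once encoding='utf-8' is passed, and it suggests that writing the string returned by to_html() through io.open might sidestep the error in the HTML case as well.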

Side note: I've also tried a SQLAlchemy connection with pandas.read_sql(), and the same error persists.
