Andreus
Andreus

Reputation: 2487

Python3, pyodbc, SQL Server: Supplying Unicode and ANSI string as needed

I use Python, and currently Python3, to access legacy (read "not going to be changed") databases with ANSI char/varchar columns/indexes.

I just discovered a major performance problem (with a new database over which I have full control) was solved by converting the database to nchar/nvarchar (see this article), so that my queries, columns and indexes were all aligned to use Unicode.

Which begs the question, and I have thus far been unable to Google it, how do I give pyodbc a non-Unicode string in Python3 so that it correctly passes the string on to ODBC/SQL Server as a non-unicode string? This has significant performance implications for a number of e.g. data-mining applications.

This seems to work, but is it correct?

conn = pyodbc.connect( connection_string )
curr = conn.cursor()
aString = 'Howdy!'
query = 'select * from aTable where aColumn = ?'
results = curr.execute( q, [aString.encode('ascii')] )

Alternately/additionally, is it more appropriate and/or possible to build a Unicode index over a non-Unicode column in SQL Server? (I have sufficient control of the database to add indexes).

Upvotes: 2

Views: 975

Answers (1)

Gord Thompson
Gord Thompson

Reputation: 123519

is it correct?

Based on what SQL Profiler and SQL Server Management Studio (SSMS) say while testing with SQL Server ODBC under Windows, it seems so, assuming that the string values really will be restricted to ASCII characters.

If we just pass [aString] as the query parameter, SQL Profiler shows that pyodbc sends this

exec sp_prepexec @p1 output,N'@P1 nvarchar(6)',N'select * from aTable where aColumn = @P1',N'Howdy!'

and if we ask SSMS to show us the estimated execution plan for

select * from aTable where aColumn = N'Howdy!'

it tells us that it expects to do an Index Scan.

However, if we pass [aString.encode('ascii')] as the query parameter, SQL Profiler shows that pyodbc sends this

exec sp_prepexec @p1 output,N'@P1 varbinary(6)',N'select * from aTable where aColumn = @P1',0x486F77647921

and if we ask SSMS to show us the estimated execution plan for

select * from aTable where aColumn = 0x486F77647921

it tells us that it expects to do an Index Seek.

A "Seek" is usually better than a "Scan", so if the queries actually return the correct results I'd expect better performance from using the encoded parameter.

Upvotes: 1

Related Questions