MCG Code
MCG Code

Reputation: 1393

Pandas error when creating DataFrame from MS SQL Server database: 'ODBC SQL type -151 is not yet supported

I'm trying to create a DataFrame from a table in MS SQL Server 2016, I have used the sample database AdventureWorks2012, and here is the code:

import pyodbc 
cnxn = pyodbc.connect("Driver={ODBC Driver 13 for SQL Server};"
                      "Server=localhost;"
                      "Database=AdventureWorks2012;"
                      "Trusted_Connection=yes;")


cursor = cnxn.cursor()
cursor.execute('SELECT * FROM HumanResources.Employee')

df = pandas.read_sql(sql, cnxn)
cursor.close()
cnxn.close()

but I get an error:

----> 1 df = pandas.read_sql(sql, cnxn)

ProgrammingError: ('ODBC SQL type -151 is not yet supported. column-index=3 type=-151', 'HY106')

Upvotes: 4

Views: 5252

Answers (2)

adrien
adrien

Reputation: 560

If you have no use of the columns with ODBC type -151 (TSQL hierarchid), you can simply add a data type handler on the connection which converts it to a string:

    def HandleHierarchyId(v):
      return str(v)

    conn = pyodbc.connect(connection_string)
    conn.add_output_converter(-151, HandleHierarchyId)

You can also use any other method listed here to convert this type.

Upvotes: 10

Scratch'N'Purr
Scratch'N'Purr

Reputation: 10437

So I'll just create my answer since I now know the complete context of your problem. The issue is related to ODBC driver compatibility issues with the new MS SQL Server 2016. You mentioned that you were able to whittle down your fields to one that had a data type of hierarchyid. Based on the documentation presented here, you can convert it to a nvarchar(4000) string representation. Thus, your solution would be in how you write your query.

Where you have your code currently as:

cursor.execute('SELECT * FROM HumanResources.Employee')

I would modify it to:

cursor.execute("""
SELECT CAST(theHierarchyIdField AS NVARCHAR(4000)) AS myConvertedField
    ,additionalField
    ,...
FROM HumanResources.Employee
""")

I can understand that it would be annoying to explicitly write all the fields in the query that you want to pull, but it's the only way to do the conversion on the SQL side before pulling it into Python as a recognizable data type.

Another solution would be to redesign your table schema and alter the hierarchyid type to nvarchar(4000), but I don't know if you have the rights to alter the table so I'll just propose the above solution.

Also, if you're planning to use pandas to pull in the data, then just set your sql variable to the query string and read_sql:

sql = """
SELECT CAST(theHierarchyIdField AS NVARCHAR(4000)) AS myConvertedField
    ,additionalField
    ,...
FROM HumanResources.Employee
"""

df = pandas.read_sql(sql, cnxn)

Upvotes: 7

Related Questions