terrah27
terrah27

Reputation: 21

Execute SQL file, return results as Pandas DataFrame

I have a complex SQL Server query that I would like to execute from Python and return the results as a Pandas DataFrame.

My database is read only so I don't have a lot of options like other answers say for making less complex queries.

This answer was helpful, but I keep getting TypeError: 'NoneType' object is not iterable

SQL Example

This is not the real query - just to demonstrate I have temporary tables. Using global temporary tables because my queries failed previously using local temp tables: See this question

SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

IF OBJECT_ID('tempdb..##temptable') IS NOT NULL DROP TABLE ##temptable
IF OBJECT_ID('tempdb..##results') IS NOT NULL DROP TABLE ##results

DECLARE @closing_period int = 0, @starting_period int = 0

Select col1, col2, col3 into ##temptable from readonlytables

Select * into ##results from ##temptable

Select * from ##results

Execute query with pyodbc and pandas

conn = pyodbc.connect('db connection details')

sql = open('myquery.sql', 'r')
df = read_sql_query(sql.read(), conn)
sql.close()
conn.close()

Results - Full Stack Trace

ypeError                                 Traceback (most recent call last)
<ipython-input-38-4fcfe4123667> in <module>
      5 
      6 sql = open('sql/month_end_close_hp.sql', 'r')
----> 7 df = pd.read_sql_query(sql.read(), conn)
      8 #sql.close()
      9 

C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py in read_sql_query(sql, con, index_col, coerce_float, params, parse_dates, chunksize)
    330         coerce_float=coerce_float,
    331         parse_dates=parse_dates,
--> 332         chunksize=chunksize,
    333     )
    334 

C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py in read_query(self, sql, index_col, coerce_float, params, parse_dates, chunksize)
   1632         args = _convert_params(sql, params)
   1633         cursor = self.execute(*args)
-> 1634         columns = [col_desc[0] for col_desc in cursor.description]
   1635 
   1636         if chunksize is not None:

TypeError: 'NoneType' object is not iterable

When I run the query in my database I get the expected results. If I pass the query in as a string I also get the expected results:

Query as String

conn = pyodbc.connect('db connection details')

sql = '''
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

IF OBJECT_ID('tempdb..##temptable') IS NOT NULL DROP TABLE ##temptable
IF OBJECT_ID('tempdb..##results') IS NOT NULL DROP TABLE ##results

DECLARE @closing_period int = 0, @starting_period int = 0

Select col1, col2, col3 into ##temptable from readonlytables

Select * into ##results from ##temptable

Select * from ##results
'''

df = read_sql(sql, conn)

conn.close()

I think it might have something to do with the single quotes inside my query?

Upvotes: 1

Views: 3091

Answers (1)

terrah27
terrah27

Reputation: 21

I got it working.

I had to use global variables by replacing @ with @@ I was able to get the query working as expected.

DECLARE @@closing_period int = 0, @@starting_period int = 0

Update: My ODBC driver was very outdated - after updating to the latest version, I no longer needed global temp tables or variables - and the query ran significantly faster.

Upvotes: 1

Related Questions