Reputation: 1145
I am trying to query a Microsoft SQL database to get all records after the datetime returned in the previous loop. I use the '>' operator but the result I am seeing is for '>=' (greater than or equal).
Python (3.6.5 Win10) script:
import pyodbc # version 4.0.24
import time
import datetime
import json
con = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=mydb;UID=myuid;PWD=mypwd")
cur = con.cursor()
i = 0
lastDatetime = datetime.datetime.strptime('01Jan1970', '%d%b%Y')
while i < 60:
print("{0}: SELECT * FROM dbo.Data WHERE datetime > '{1}'".format(str(i), lastDatetime))
cur.execute("SELECT * FROM dbo.Data WHERE datetime > ?", lastDatetime)
rows = cur.fetchall()
for k, v in enumerate(rows):
jsonMsg = json.dumps({
"transaction_id": v[1],
"plu": v[2].strip(),
"qty": int(v[3]),
"dateTime": str(v[4])
}, separators=(',', ':'))
print(str(jsonMsg))
lastDatetime = v[6]
print("lastDatetime set to: {0}".format(lastDatetime))
i = i + 1
time.sleep(5)
Produces:
0: SELECT * FROM dbo.Data WHERE datetime > '1970-01-01 00:00:00'
{"transaction_id":3201,"plu":"3","qty":1,"dateTime":"2018-08-01 10:45:40.560000"}
lastDatetime set to: 2018-08-01 10:45:40.560000
{"transaction_id":3202,"plu":"5","qty":1,"dateTime":"2018-08-01 10:45:51.563000"}
lastDatetime set to: 2018-08-01 10:48:47.653000
{"transaction_id":3230,"plu":"8","qty":2,"dateTime":"2018-08-01 10:48:47.653000"}
lastDatetime set to: 2018-08-01 10:48:47.653000
1: SELECT * FROM dbo.Data WHERE datetime > '2018-08-01 10:48:47.653000'
{"transaction_id":3230,"plu":"8","qty":2,"dateTime":"2018-08-01 10:48:47.653000"}
lastDatetime set to: 2018-08-01 10:48:47.653000
2: SELECT * FROM dbo.Data WHERE datetime > '2018-08-01 10:48:47.653000'
{"transaction_id":3230,"plu":"8","qty":2,"dateTime":"2018-08-01 10:48:47.653000"}
lastDatetime set to: 2018-08-01 10:48:47.653000
I run the query in SQL Server Management Studio and I get the expected result:
SELECT * FROM dbo.Data WHERE datetime > '2018-08-01 10:48:47.653'
returns no records.
What am I missing or what should I change in the python script to get the expected result?
Thanks.
EDITS: Add the '' to the print of the sql string around the datatime. Didn't make a difference to the result.
Upvotes: 2
Views: 1540
Reputation: 89361
You are hitting the breaking change for SQL 2016+ noted here:
Under database compatibility level 130, implicit conversions from datetime to datetime2 data types show improved accuracy by accounting for the fractional milliseconds, resulting in different converted values. Use explicit casting to datetime2 datatype whenever a mixed comparison scenario between datetime and datetime2 datatypes exists. For more information, refer to this Microsoft Support Article.
Breaking Changes in SQL Server 2016
The problem is that your table is still using the old datetime
data type. And it's a really weird type. It doesn't have ms resolution. Its resolution is actually 1/300th of a second. So the actual stored value is not always quite what it looks like. The storage is inexact, like for floating point numbers.
The new better datetime2
has more precision and always stores decimal fractional seconds. Anyway SQL 2016 changed how datetime
gets converted to datetime2
, which has a side-effect of making comparisons involving implicit conversions sometimes break.
To fix this either change the data type on your table to datetime2(3) for 1ms precision. Or force the parameter to be a datetime, instead of a datetime2. I don't know if pyodbc allows that, but you can force the conversion in the query easily:
cur.execute("SELECT * FROM dbo.Data WHERE datetime >= cast(? as datetime)", lastDatetime)
Now instead of converting the table column to datetime2
you convert the parameter value to datetime
, and the comparison works as expected.
Or reduce your database compatibility level to 120.
Upvotes: 3
Reputation: 1145
I've solved it by adding 1 millisecond to the last time along with change the '>' to a '>='. As the millisecond is the smallest increment on time in MS SQL at the moment, this method should not miss any data.
I suspect it has something to do with pythons handling of floats but cannot be sure as adding 1msec as a float to the seconds part didn't quite increment it enough.
Revised code:
import pyodbc # version 4.0.24
import time
import datetime
import json
con = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=mydb;UID=myuid;PWD=mypwd")
cur = con.cursor()
i = 0
lastDatetime = datetime.datetime.strptime('01Jan1970', '%d%b%Y')
while i < 60:
print("{0}: SELECT * FROM dbo.Data WHERE datetime >= '{1}'".format(str(i), lastDatetime))
cur.execute("SELECT * FROM dbo.Data WHERE datetime >= ?", lastDatetime)
rows = cur.fetchall()
for k, v in enumerate(rows):
jsonMsg = json.dumps({
"transaction_id": v[1],
"plu": v[2].strip(),
"qty": int(v[3]),
"dateTime": str(v[4])
}, separators=(',', ':'))
print(str(jsonMsg))
lastDatetime = v[6]
# work around a bug with datetime > returning as if >=
lastDatetime = lastDatetime + datetime.timedelta(milliseconds=1)
print("lastDatetime set to: {0}".format(lastDatetime))
i = i + 1
time.sleep(5)
Upvotes: -1