Trevor
Trevor

Reputation: 1145

MS SQL Python Datetime later than but not equal '>' returning result as '>='

I am trying to query a Microsoft SQL database to get all records after the datetime returned in the previous loop. I use the '>' operator but the result I am seeing is for '>=' (greater than or equal).

Python (3.6.5 Win10) script:

import pyodbc  # version 4.0.24
import time
import datetime
import json
con = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=mydb;UID=myuid;PWD=mypwd")
cur = con.cursor()
i = 0
lastDatetime = datetime.datetime.strptime('01Jan1970', '%d%b%Y')
while i < 60:
    print("{0}: SELECT * FROM dbo.Data WHERE datetime > '{1}'".format(str(i), lastDatetime))
    cur.execute("SELECT * FROM dbo.Data WHERE datetime > ?", lastDatetime)
    rows = cur.fetchall()
    for k, v in enumerate(rows):
        jsonMsg = json.dumps({
            "transaction_id": v[1],
            "plu": v[2].strip(),
            "qty": int(v[3]),
            "dateTime": str(v[4])
        }, separators=(',', ':'))
        print(str(jsonMsg))
        lastDatetime = v[6]
        print("lastDatetime set to: {0}".format(lastDatetime))
    i = i + 1
    time.sleep(5)

Produces:

0: SELECT * FROM dbo.Data WHERE datetime > '1970-01-01 00:00:00'
{"transaction_id":3201,"plu":"3","qty":1,"dateTime":"2018-08-01 10:45:40.560000"}
lastDatetime set to: 2018-08-01 10:45:40.560000
{"transaction_id":3202,"plu":"5","qty":1,"dateTime":"2018-08-01 10:45:51.563000"}
lastDatetime set to: 2018-08-01 10:48:47.653000
{"transaction_id":3230,"plu":"8","qty":2,"dateTime":"2018-08-01 10:48:47.653000"}
lastDatetime set to: 2018-08-01 10:48:47.653000
1: SELECT * FROM dbo.Data WHERE datetime > '2018-08-01 10:48:47.653000'
{"transaction_id":3230,"plu":"8","qty":2,"dateTime":"2018-08-01 10:48:47.653000"}
lastDatetime set to: 2018-08-01 10:48:47.653000
2: SELECT * FROM dbo.Data WHERE datetime > '2018-08-01 10:48:47.653000'
{"transaction_id":3230,"plu":"8","qty":2,"dateTime":"2018-08-01 10:48:47.653000"}
lastDatetime set to: 2018-08-01 10:48:47.653000

I run the query in SQL Server Management Studio and I get the expected result:

SELECT * FROM dbo.Data WHERE datetime > '2018-08-01 10:48:47.653' 

returns no records.

What am I missing or what should I change in the python script to get the expected result?

Thanks.

EDITS: Add the '' to the print of the sql string around the datatime. Didn't make a difference to the result.

Upvotes: 2

Views: 1540

Answers (2)

David Browne - Microsoft
David Browne - Microsoft

Reputation: 89361

You are hitting the breaking change for SQL 2016+ noted here:

Under database compatibility level 130, implicit conversions from datetime to datetime2 data types show improved accuracy by accounting for the fractional milliseconds, resulting in different converted values. Use explicit casting to datetime2 datatype whenever a mixed comparison scenario between datetime and datetime2 datatypes exists. For more information, refer to this Microsoft Support Article.

Breaking Changes in SQL Server 2016

The problem is that your table is still using the old datetime data type. And it's a really weird type. It doesn't have ms resolution. Its resolution is actually 1/300th of a second. So the actual stored value is not always quite what it looks like. The storage is inexact, like for floating point numbers.

The new better datetime2 has more precision and always stores decimal fractional seconds. Anyway SQL 2016 changed how datetime gets converted to datetime2, which has a side-effect of making comparisons involving implicit conversions sometimes break.

To fix this either change the data type on your table to datetime2(3) for 1ms precision. Or force the parameter to be a datetime, instead of a datetime2. I don't know if pyodbc allows that, but you can force the conversion in the query easily:

cur.execute("SELECT * FROM dbo.Data WHERE datetime >= cast(? as datetime)", lastDatetime)

Now instead of converting the table column to datetime2 you convert the parameter value to datetime, and the comparison works as expected.

Or reduce your database compatibility level to 120.

Upvotes: 3

Trevor
Trevor

Reputation: 1145

I've solved it by adding 1 millisecond to the last time along with change the '>' to a '>='. As the millisecond is the smallest increment on time in MS SQL at the moment, this method should not miss any data.

I suspect it has something to do with pythons handling of floats but cannot be sure as adding 1msec as a float to the seconds part didn't quite increment it enough.

Revised code:

import pyodbc  # version 4.0.24
import time
import datetime
import json
con = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=mydb;UID=myuid;PWD=mypwd")
cur = con.cursor()
i = 0
lastDatetime = datetime.datetime.strptime('01Jan1970', '%d%b%Y')
while i < 60:
    print("{0}: SELECT * FROM dbo.Data WHERE datetime >= '{1}'".format(str(i), lastDatetime))
    cur.execute("SELECT * FROM dbo.Data WHERE datetime >= ?", lastDatetime)
    rows = cur.fetchall()
    for k, v in enumerate(rows):
        jsonMsg = json.dumps({
            "transaction_id": v[1],
            "plu": v[2].strip(),
            "qty": int(v[3]),
            "dateTime": str(v[4])
        }, separators=(',', ':'))
        print(str(jsonMsg))
        lastDatetime = v[6]
        # work around a bug with datetime > returning as if >=
        lastDatetime = lastDatetime + datetime.timedelta(milliseconds=1)
        print("lastDatetime set to: {0}".format(lastDatetime))
    i = i + 1
    time.sleep(5)

Upvotes: -1

Related Questions