Reputation: 3288
Below is a dummy example to demonstrate how the query results are different, the real query is more more complex so the query structure may seem overkill in this example. Set up a connection to a sqlite3 database and add these records to start with:
import sqlite3
connection = sqlite3.connect(
'file:test_database',
detect_types=sqlite3.PARSE_DECLTYPES,
isolation_level=None,
check_same_thread=False,
uri=True
)
cursor = connection.cursor()
tableA_records = [(1, 202003), (2, 202003), (3, 202003), (4, 202004), (5, 202004), (6, 202004), (7, 202004), (8, 202004), ]
tableB_records = [(1, 202004), (2, 202004), (3, 202004), (4, 202004), (5, 202004),]
tableA_ddl = """
create table tableA
(
ID int,
RunYearMonth int
);
"""
tableB_ddl = """
create table tableB
(
ID int,
RunYearMonth int
);
"""
cursor.execute(tableA_ddl)
cursor.execute(tableB_ddl)
cursor.executemany("INSERT INTO tableA VALUES (?, ?)", tableA_records)
cursor.executemany("INSERT INTO tableB VALUES (?, ?)", tableB_records)
Now we have two tables (A and B) with 8 and 5 records, respectively. I want to count records that have the same ID and date between the two when the date is 202004
.
I have this query now:
SELECT COUNT(*)
FROM (
SELECT *
FROM `tableA`
WHERE `RunYearMonth` = 202004
) AS `A`
INNER JOIN (
SELECT *
FROM `tableB`
WHERE `RunYearMonth` = 202004
) AS `B`
ON `A`.`ID` = `B`.`ID`
AND `A`.`RunYearMonth` = `B`.`RunYearMonth`
This, as expected, returns 2
when run in a sqlite console.
When run in Python, though, you get a different result.
q = """
SELECT COUNT(*)
FROM (
SELECT *
FROM `tableA`
WHERE `RunYearMonth` = 202004
) AS `map1`
INNER JOIN (
SELECT *
FROM `tableB`
WHERE `RunYearMonth` = 202004
) AS `map2`
ON `map1`.`ID` = `map2`.`ID`
AND `map1`.`RunYearMonth` = `map2`.`RunYearMonth`
"""
cursor.execute(q)
print(cursor.fetchall())
This returns instead 5
which effectively ignores the WHERE
clauses in the subqueries and the join condition they have the same RunYearMonth
, there are records 1-5 in both.
What would cause this difference? Does Python not simply pass the query string through?
Pertinent versions:
sqlite3.version == 2.6.0
sqlite3.sqlite_version == 3.31.1
sys.version == 3.6.5
Upvotes: 0
Views: 202
Reputation: 52374
I created a test database using your first script, and then opened it in a sqlite3 shell. Your query returns 5 in it, not the 2 you're getting. After changing it to show all the rows, not just the count, it results in:
ID RunYearMonth ID RunYearMonth
---------- ------------ ---------- ------------
1 202003 1 202004
2 202003 2 202004
3 202003 3 202004
4 202004 4 202004
5 202004 5 202004
I'm not sure why those rows from tableA with RunYearMonth of 202003 are getting included; I'd think they would be filtered out by the subquery's WHERE
.
This appears to be a bug in Sqlite3 - using an older version (3.11.0) gives the expected results, and a slight tweak to the query to remove the AND map1.RunYearMonth = map2.RunYearMonth
produces the correct results on 3.31.1.
Anyways, that query can be cleaned up significantly, like so:
SELECT count(*)
FROM tableA AS A
JOIN tableB AS B ON A.ID = B.ID
AND A.RunYearMonth = B.RunYearMonth
WHERE A.RunYearMonth = 202004;
which does return the expected count of 2.
Upvotes: 1