Reputation: 942
I have three tables:
Table 1: | dbo.pc_a21a22 |
batchNbr Other columns...
-------- ----------------
12345
12346
12347
Table 2: | dbo.outcome |
passageId record
---------- ---------
00003 200
00003 9
00004 7
Table 3: | dbo.passage |
passageId passageTime batchNbr
---------- ------------- ---------
00001 2015.01.01 12345
00002 2016.01.01 12345
00003 2017.01.01 12345
00004 2018.01.01 12346
What I want to do: for each batchNbr in Table 1 get first its latest passageTime and the corresponding passageID from Table 3. With that passageID, get the relevant rows in Table 2 and establish whether any of these rows contains the record 200. Per passageId there are at most 2 records in Table 2
What is the most efficient way to do this?
I have already created a query that works, but it's awfully slow and thus unfit for tables with millions of rows. Any suggestion on how to either change the query or do it another way? Altering the table structure is not an option, I only have read rights to the database.
My current solution (slow):
SELECT TOP 50000
a.batchNbr,
CAST ( CASE WHEN 200 in (SELECT TOP 2 record FROM dbo.outcome where passageId in (
SELECT SubqueryResults.passageId From (SELECT Top 1 passageId FROM dbo.passage pass WHERE pass.batchNbr = a.batchNbr ORDER BY passageTime Desc) SubqueryResults
)
) then 1 else 0 end as bit) as KGT_IO_END
FROM dbo.pc_a21a22 a
The desired output is:
batchNbr 200present
--------- ----------
12345 1
12346 0
Upvotes: 1
Views: 503
Reputation: 1079
Try this
SELECT TOP 50000 a.*, (CASE WHEN b.record = 200 THEN 1 ELSE 0 END) AS
KGT_IO_END
FROM dbo.Test1 AS a
LEFT OUTER JOIN
(SELECT record, p.batchNbr
FROM dbo.Test2 AS o
LEFT OUTER JOIN (SELECT MAX(passageId) AS passageId, batchNbr FROM
dbo.Test3 GROUP BY batchNbr) AS p ON o.passageId = p.passageId
) AS b ON a.batchNbr = b.batchNbr;
The MAX
subquery is to get the latest passageId
by batchNbr
.
However, your example won't get the record 200, since the passageId
of the record with 200 is 00001, while the latest passageId
of the batchNbr
12345 is 00003.
I used LEFT OUTER JOIN
since the passageId
from Table2 no longer match any of the latest passageId
from Table3. The resulting subquery would have no records to join
to Table1. Therefore INNER JOIN
would not show any records from your example data.
Output from your example data:
batchNbr KGT_IO_END
12345 0
12346 0
12347 0
Output if we change the passageId
of record 200 to 00003 (the latest for 12345)
batchNbr KGT_IO_END
12345 1
12346 0
12347 0
Upvotes: 1
Reputation: 339
I suggest you use table joining rather than subqueries.
select
a.*, b.*
from
dbo.table1 a
join
dbo.table2 b on a.id = b.id
where
/*your where clause for filtering*/
EDIT:
You could use this as a reference Join vs. sub-query
Upvotes: 3