Reputation: 878
I'm writing a stored procedure to populate a multi-select control and I want to pull back at least 100 options from a database table, but I want to ensure that I bring back all the options the user has already selected.
The existing query brought back only the first 100 rows:
SELECT TOP 100
t2.Id,
t2.value2
FROM
table1 t1
INNER JOIN
table2 t2
ON
t1.clientId = t2.clientId AND t1.Id = t2.Id
WHERE
t1.clientId = 1
I've added a UNION
to this query to ensure I always get the selected options back as well:
SELECT TOP 100
t2.Id,
t2.value2
FROM
table1 t1
INNER JOIN
table2 t2 ON
t1.clientId = t2.clientId AND t1.Id = t2.Id
WHERE
t1.clientId = 1
UNION
SELECT t2.id,
t_si.externalCertificateNumber
FROM table2 t2
INNER JOIN
Table3 t3
ON t2.value = 333
In one of my test examples, there were 4 previously selected options. 3 of these would have returned in the top 100 (1 was not, hence the need to update).
I expected to get back 101 rows and was surprised to get back 104. I figured this was down to the way the DISTINCT
inherent in UNION
is working.
However what I am also seeing is different results in the 100 returned. With some, (more than 3) different values showing up. And some that were gone in the previous TOP 100
!
So my question is, does UNION
have some kind of in built sort that would be effecting the results? Or is there something else going on here?
-----------------------------EDIT------------------------------
If I changed the above queries to TOP 5:
And the previously selected values are:
1, Blue
2, Green
20, Violet
100 Indigo
The result for the SELECT on its own would look like this:
1, Blue
2, Green
7, Red
15, Cyan
20, Violet
The result for the SELECT and the UNION
1, Blue
2, Green
33, Cyan
20, Violet
24, Yellow
100, Indigo
60, Aero
40, Amber
25, Black
This is a made up example not from the DB, but note that the Red in the SELECT is missing from the SELECT and UNION
Upvotes: 1
Views: 592
Reputation: 32703
I'm starting to suspect that adding the
UNION
has changed the execution plan, and since there is no order explicitly set, that this is affecting what's coming back.
You are correct.
TOP 100
without ORDER BY
returns some 100 rows. Which rows are returned is not defined.
The first version of the query without UNION
returned one set of 100 rows, the second version of the query with UNION
returned another set of 100 rows.
To get predictable and expected result you should add ORDER BY
to the query that uses TOP
. Ideally the sorting should be unambiguous.
Yes, UNION
does have a "built-in" sort. If you look at the execution plan you would usually see it.
I prefer to use Common-Table Expressions (CTE) to write complex queries, like this:
WITH
CTE
AS
(
SELECT TOP 100
t2.Id,
t2.value2
FROM
table1 t1
INNER JOIN table2 t2
ON t1.clientId = t2.clientId
AND t1.Id = t2.Id
WHERE
t1.clientId = 1
ORDER BY
t2.Id
)
SELECT
Id,
value2
FROM
CTE
UNION
SELECT
t2.id,
t_si.externalCertificateNumber
FROM
table2 t2
INNER JOIN Table3 t3 ON t2.value = 333
;
Upvotes: 2
Reputation: 522817
The UNION
operator removes duplicate records. To retain all records use UNION ALL
:
SELECT TOP 100
t2.Id,
t2.value2
FROM
table1 t1
INNER JOIN
table2 t2 ON
t1.clientId = t2.clientId AND
t1.Id = t2.Id
WHERE
t1.clientId = 1
UNION ALL
SELECT t2.id, t_si.externalCertificateNumber FROM table2 t2
INNER JOIN Table3 t3
ON t2.value = 333
The missing records from the TOP 100
query can easily be explained by the UNION
, and also the differing number of records in your result set.
Upvotes: 2