Reputation: 1644
Can someone explain to me what is happening here. The behavior is not as I expect:
select category,timefield from test where category='A'
union
select top 10 category,timefield from test where category='B' order by timefield desc
What I want is all rows of category A and only the 10 most recent from category B.
When executed, the results are actually: All of category A (sorted by timefield descending) + the first 10 rows of category B (sorted by timefield descending)
Stranger still, if we take this whole statement and place it in a subquery, I would expect the behavior to remain the same and evaluated as before, but it is not:
select * from (
select category,timefield from test where category='A'
union
select top 10 category,timefield from test where category='B' order by timefield desc
) subq
This returns all of category A (in ascending order) + the 10 most recent of category B (in descending order). Which is exactly what I want.
Why does the first statement not produce what I expect and why does the second behave very differently ??
(This is SQL Server 2008)
Upvotes: 2
Views: 114
Reputation: 1269673
You need parentheses in the first query, because the order by
is applied to the result of the union
. It is interpreted as:
(select category,timefield from test where category='A'
union
select top 10 category,timefield from test where category='B'
)
order by timefield desc
(I'm not saying this is valid syntax.)
whereas what you want is:
(select category, timefield from test where category='A')
union
(select top 10 category, timefield from test where category='B' order by timefield desc)
Do note that union
will remove duplicates. If you don't want this additional overhead, use union all
instead.
As to why this works as a subquery, my guess is that it is a coincidence. SQL Server is not going to guarantee the results being returned when you use top
without an order by
-- or even that the results are consistent from one call to the next. Sometimes it might actually do what you want.
Upvotes: 5