AllmanTool
AllmanTool

Reputation: 1554

TSQL CTE's (common table expression) bug

I'm working with Microsoft SQL Server 2008 (SP3) - 10.0.5520.0 (X64) Standard Edition (64-bit) on Windows NT 6.0 (Build 6002: Service Pack 2)

Description:

  1. [ORC_RESULT] and GROUPING_SELECT are CTE table.

I have strange bug with CTE (common table expression).

SELECT a.keysbor
--,gs.[rank]
FROM [ORC_RESULT] AS a inner JOIN GROUPING_SELECT AS gs
    ON a.keysbor = gs.keysbor
WHERE gs.[RANK] = 1
ORDER BY a.keysbor

result 36 rows (no duplicate rows)

SELECT a.keysbor
,gs.[rank]
FROM [ORC_RESULT] AS a inner JOIN GROUPING_SELECT AS gs
    ON a.keysbor = gs.keysbor
WHERE gs.[RANK] = 1 
ORDER BY a.keysbor

result 29 rows

Q: Why filter gs.[RANK] = 1 have applied only when field gs.[rank] exists in SELECT statement?

without gs.[rank]   with gs.[rank]  rank
10072002992443  10072002992443       1
10072002992444  10072002992444       1
10072002992445  10072002992445       1
10072002992446  10072002992446       1
10072002992447  10072002992447       1
10072002992448  10072002992448       1
10072002992449  10072002992449       1
10072002992450  10072002992450       1
10072002992451  10072002992451       1
10072002992452  10072002992452       1
10072002992453  10072002992453       1
10072002992454  10072002992454       1
10072002992455  10072002992455       1
10072002992456  10072002992456       1
10072002992457  10072002992457       1
10072002992458  10072002992458       1
10072002992459  10072002992459       1
10072002992460  10072002992460       1
10072002992461                       2
10072002992462                       2
10072002992463                       2
10072002992464                       2
10072002992465                       2
10072002992466                       2
10072002992467                       2
10072002992736  10072002992736       1
10072002992866  10072002992866       1
10072002992867  10072002992867       1
10072002992868  10072002992868       1
10072002992869  10072002992869       1
10072002992870  10072002992870       1
10072002992871  10072002992871       1
10072002992872  10072002992872       1
10072002992873  10072002992873       1
10072002992874  10072002992874       1
10072002992875  10072002992875       1

Upvotes: 3

Views: 334

Answers (1)

sepupic
sepupic

Reputation: 8697

I think your [rank] is not calculated in determinate way, i.e.

PARTITION BY A.[KEYKRT], A.[VIDSBR],A.[ORC_ID_ED] ORDER BY A.[ORC_ID_ED] 

does not uniqely determine the order, so depending on plan operators, row_number() can assign 1 to different rows. I give you one example where the plan is the same, but the order of rows proceeded is different, and with the same input data you'll get 2 different results. Depending on what columns you put in SELECT, optimizer builds different plan and your RANK is equal to 1 in different rows (corresponding to different keysbor), so when it comes to join on keysbor, different number of rows are returned. So in both cases the filter rank = 1 is applyed, but 1 is assigned to different rows.

Here is the example:

    declare @t table( a int, b int, c int, primary key(c desc, b)); /*run it as it is, then uncomment PK and run again*/
    insert into @t values 
    (1,1,1), (1,1,2), (1,1,3),
    (1,2,1), (1,2,2), (1,2,3);

    declare @t1 table( c int);
    insert @t1 values (1);

    with cte as (
    select row_number() over (partition by a, b order by b) as rn,
           a,
           b,
           c
    from @t 
    group by a, b, c
    )

    select *
    from cte c join @t1 t on c.c = t.c
    where rn = 1;

As you can see, I always have the same input, but in the first case I don't declare Primary Key and in the second I do. Rows returned are different, but it does not mean that filter do not work, it just means that in the first case rn = 1 was attributed to 2 rows with c=1, but in the second case rn = 1 have two rows with c=1 and c=3.

Upvotes: 2

Related Questions