Milimetric
Milimetric

Reputation: 13549

Why does newid() materialize at the very end of a query?

If you run the following sample code in SQL Server, you'll notice that newid() materializes after the join whereas row_number() materializes before the join. Does anyone understand this and if there's a way to work around it?

declare @a table ( num varchar(10) )
 insert into @a values ('dan')
 insert into @a values ('dan')
 insert into @a values ('fran')
 insert into @a values ('fran')

 select *
   from @a T
            inner join
        (select num, newid() id
           from @a
          group by num) T1      on T1.num = T.num         

 select *
   from @a T
            inner join
        (select num, row_number() over (order by num) id
           from @a
          group by num) T1      on T1.num = T.num

Upvotes: 1

Views: 521

Answers (2)

user2378068
user2378068

Reputation: 11

i had a similar problem and found out that the "inner join" was the problem. i was able to use "left joins" ...

Upvotes: 1

Aaron Bertrand
Aaron Bertrand

Reputation: 280262

Not sure I see what the problem is here. Materialize the subquery T1 first:

SELECT num, ROW_NUMBER() OVER (ORDER BY num) 
    FROM @a
    GROUP BY num;

You get two rows:

dan  1
fran 2

Now join that against a on num = num, you get 4 rows, 2 for each distinct value. What is your actual goal here? Perhaps you should be applying ROW_NUMBER() outside?

The order of materialization is up to the optimizer. You'll find that other built-ins (RAND(), GETDATE() etc.) have similarly inconsistent materialization behavior. Not much you can do about it, and not much chance they're going to "fix" it.

EDIT

New code sample. Write the contents of @a to a #temp table to "materialize" the NEWID() assignment per unique num value.

SELECT num, id = NEWID() 
  INTO #foo FROM @a GROUP BY num;

SELECT a.num, f.id 
  FROM @a AS a 
  INNER JOIN #foo AS f 
  ON a.num = f.num;

DROP TABLE #foo;

Upvotes: 0

Related Questions