sactiw
sactiw

Reputation: 22461

UPDATE using INNER JOIN on SUBQUERY not working as expected

I am trying to UPDATE col1 in tableA taking data from tableD using this query (prototype):

UPDATE tableA
SET TableA.col1 = tableB.col1
FROM tableA 
INNER JOIN
    (SELECT tableC.col1, id 
     FROM
         (SELECT 
              ROW_NUMBER() OVER(PARTITION BY tableD.col1 ORDER BY (tableD.someCol desc) AS "Row Number", 
              col1, id 
          FROM tableD) tableC
     WHERE [Row Number] = 1) tableB ON tableA.id = tableB.id
WHERE 
    some_clause_on_tableA_and_tableB;

But unfortunately this doesn't works properly (incorrect updates), but if I explicitly store data of tableB in a #temp table and then try to update tableA from that #temp table then it works fine (prototype query):

SELECT * 
INTO #temp 
FROM
    (SELECT 
         ROW_NUMBER() OVER(PARTITION BY tableD.col1 ORDER BY (tableD.someCol desc) AS "Row Number", 
         col1, id 
     FROM tableD) tableC
WHERE
    [Row Number]=1;

UPDATE tableA
SET TableA.col1 = tableB.col1
FROM tableA 
INNER JOIN #temp tableB ON tableA.id = tableB.id
WHERE some_clause_on_tableA_and_tableB;

I am not sure why the later works but first doesn't. It seems that in first query the two nested subqueries (with row ranking) are not able to fetch correct data but I may be wrong. Any help would be greatly appreciated!!!

Thanks, Sachin

Upvotes: 0

Views: 5143

Answers (2)

Andriy M
Andriy M

Reputation: 77677

There must be something not covered by the examples. I mean, the two methods presented are equivalent to each other. But if your real query looks more complex, than that may somehow prevent you from seeing the real issue.

To help yourself to spot the issue sooner, you could simplify your query by getting rid of one level of nesting: the [Row Number]=1 condition could be moved to the main query, like this:

UPDATE tableA
    SET TableA.col1 = tableB.col1
FROM tableA INNER JOIN (
    SELECT
        ROW_NUMBER() OVER(PARTITION BY tableD.col1 ORDER BY (tableD.someCol desc) AS "Row Number",
        col1,
        id
    FROM tableD
) tableB
ON tableA.id = tableB.id
WHERE some_clause_on_tableA_and_tableB
  AND tableB.[Row Number] = 1;

Furthermore, since you are not really assigning tableA a new alias, you could omit it from the FROM clause of the UPDATE statement, like this:

UPDATE tableA
    SET TableA.col1 = tableB.col1
FROM (
    SELECT
        ROW_NUMBER() OVER(PARTITION BY tableD.col1 ORDER BY (tableD.someCol desc) AS "Row Number",
        col1,
        id
    FROM tableD
) tableB
ON tableA.id = tableB.id
WHERE some_clause_on_tableA_and_tableB
  AND tableB.[Row Number] = 1;

There's no implication to omitting the target table from the FROM clause, and yet the query itself might become a tiny bit cleaner and thus easier to spot a potential issue in.

Alternatively, as suggested by @mouters, you could rewrite the subselect as a common table expression (CTE) or, in fact, a series thereof if, for instance, tableD in reality is a subquery too. Consider this:

WITH tableB AS (
    SELECT
        ROW_NUMBER() OVER(PARTITION BY tableD.col1 ORDER BY (tableD.someCol desc) AS "Row Number",
        col1,
        id
    FROM tableD
)
UPDATE tableA
    SET TableA.col1 = tableB.col1
FROM tableA INNER JOIN tableB  /* or just: FROM tableB */
ON tableA.id = tableB.id
WHERE some_clause_on_tableA_and_tableB
  AND tableB.[Row Number] = 1;

Again, there's no implications to rewriting the subselect as a CTE (not in this particular case, since you are only referencing the CTE once), but this, in its own way, may (or may not) be instrumental in helping you to find the cause of the problem.

Upvotes: 0

Chris Moutray
Chris Moutray

Reputation: 18379

I get the feeling your probably missing the OVER clause making use of partition and order by keywords to group your tableA id's referenced in tableD. So basically what I think you're trying to do is update each entry found in tableA with the first row found in tableD for each of tableA's id's.

Perhaps try something like this as your tableC subquery

SELECT
    ROW_NUMBER() OVER (PARTITION BY tableD.fk_to_tableA order by tableD.something
    ,col1 FROM tableD

EDIT following comment

For what its worth here's a simple example of the query you're trying and it works fine. In what way does this example differ from your problem?

declare @entity table (id int, latestEventId int)

declare @event table (id int, entityId int)

insert into @entity values (1, null), (2, null), (3, null), (4, null), (5, null)

insert into @event values (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 1), (7, 2), (8, 3), (9, 4), (10, 5), (11, 1), (12, 2), (13, 3), (14, 4), (15, 5)

update tableA
    set tableA.latestEventId = tableB.id
from 
    @entity tableA
inner join
(
    select
        * 
    from 
    (
        select 
            *,
            rn = ROW_NUMBER() over (partition by e.entityId order by e.id desc)
        from
            @event e
    ) tableC
    where
        rn = 1
) tableB 
on tableB.entityId = tableA.id

-- expect the result of entity id's 1 to 5 with the values 11 to 15
select * from @entity

Upvotes: 2

Related Questions