Anestis Kivranoglou
Anestis Kivranoglou

Reputation: 8194

Postgresql 9.3 Update Join Affects different number of rows than Select

I need to select a number of certain positions from my Db and then Update Them.

The awkward thing is that, while executing the Select Statement on its own it wields 90k Rows. But while executing the Update statement on the same Select it affects 80k Rows.

This Yields 90k rows.

SELECT
    "Extent2"."AppId",
    "Extent2"."DateCreated"
FROM
    "dbo"."Route" AS "Extent1"
    INNER JOIN
    "dbo"."Positions" AS "Extent2" ON
        "Extent1"."BoundAppId" = "Extent2"."AppId" AND
        "Extent2"."DateCreated" > "Extent1"."TimeStarted" AND
        "Extent2"."DateCreated" < "Extent1"."TimeFinished"
WHERE
    "Extent1"."TimeStarted" IS NOT NULL AND
    "Extent1"."TimeFinished" IS NOT NULL AND
    FALSE = "Extent1"."IsDeleted" AND
    "Extent1"."TimeFinished" < '2016-02-06'

This Affects 80k rows (The nested Select is Identical)

UPDATE "dbo"."Positions" AS j0
SET "ArchiveUntil" = '2020-02-16'
FROM (
    SELECT
        "Extent2"."AppId",
        "Extent2"."DateCreated"
    FROM
        "dbo"."Route" AS "Extent1"
        INNER JOIN
        "dbo"."Positions" AS "Extent2" ON
            "Extent1"."BoundAppId" = "Extent2"."AppId" AND
            "Extent2"."DateCreated" > "Extent1"."TimeStarted" AND
            "Extent2"."DateCreated" < "Extent1"."TimeFinished"
    WHERE
        "Extent1"."TimeStarted" IS NOT NULL AND
        "Extent1"."TimeFinished" IS NOT NULL AND
        FALSE = "Extent1"."IsDeleted" AND
        "Extent1"."TimeFinished" < '2016-02-06'
) as j1
WHERE (j0."AppId" = j1."AppId" AND j0."DateCreated" = j1."DateCreated")

This SQL was Generated from an EntityFramework.Extended Library Source Commit branch (not merged with parent) but it seems sane to me....

Or maybe its not?

Upvotes: 0

Views: 168

Answers (2)

wildplasser
wildplasser

Reputation: 44250

You are not using any values from the subquery to do the update, you are just checking if records spanning the "dateCreated" exist and (maybe) update it using a constant/literal value.

UPDATE "dbo"."Positions" dst
SET "ArchiveUntil" = '2020-02-16'
WHERE EXISTS (
    SELECT 1
    FROM "dbo"."Route" src
    WHERE dst."BoundAppId" = src."AppId" 
      AND dst."DateCreated" > src."TimeStarted"
      AND dst."DateCreated" < src."TimeFinished"

      -- the two conditions below are redundant
      -- AND src."TimeStarted" IS NOT NULL
      -- AND src."TimeFinished" IS NOT NULL
      AND src."IsDeleted" <> True
      AND src."TimeFinished" < '2016-02-06'
        ) ;

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270713

Your query is ok. I think you have simply discovered that (AppId, DateCreated) is not unique in Positions or that one of the values is NULL.

You can run this query to verify the counts:

select appid, DateCreated, count(*)
from dbo.Positions
group by appid, DateCreated
having count(*) > 1;

It is also possible that some of the values are NULL, which you can find by using:

select p.*
from dbo.Positions p
where appid is null or DateCreated is null;

EDIT:

Oops. I suspect the above is the right idea, just the wrong columns. Try running this version (your original query with aggregation):

SELECT "Extent2"."AppId", "Extent2"."DateCreated", COUNT(*) as cnt
FROM
    "dbo"."Route" AS "Extent1"
    INNER JOIN
    "dbo"."Positions" AS "Extent2" ON
        "Extent1"."BoundAppId" = "Extent2"."AppId" AND
        "Extent2"."DateCreated" > "Extent1"."TimeStarted" AND
        "Extent2"."DateCreated" < "Extent1"."TimeFinished"
WHERE
    "Extent1"."TimeStarted" IS NOT NULL AND
    "Extent1"."TimeFinished" IS NOT NULL AND
    FALSE = "Extent1"."IsDeleted" AND
    "Extent1"."TimeFinished" < '2016-02-06'
GROUP BY "Extent2"."AppId", "Extent2"."DateCreated"
HAVING COUNT(*) >= 2;

I am guessing that this is returning duplicates (so a pair of values gets counted twice). However, a row is only updated once in an update statement (so the same pair is only updated once).

Upvotes: 1

Related Questions