Reputation: 4942
I am just learning SQL and I'm really struggling to understand why my left join is returning duplicate rows. This is the query I'm using:
SELECT "id", "title"
FROM "posts"
LEFT JOIN "comments" "comment"
ON "comment"."post_id"="id" AND ("comment"."status" = 'hidden')
It returns 4 rows, but should only return 3. Two of the returned rows contain are duplicate (same values). I can fix this by using the DISTINCT
prefix on "id"
.
SELECT DISTINCT "id", "title"
FROM "posts"
LEFT JOIN "comments" "comment"
ON "comment"."post_id"="id" AND ("comment"."status" = 'hidden')
The query returns 3 rows and I get desired result. But I'm still wondering why in the world I would get a duplicate row from the first query in the first place? I'm trying to write an aggregation query and this seems to be the issue I'm having.
I'm using PostgreSQL.
More specific: (as created by my ORM)
Shift DDL
CREATE TABLE shift (
id uuid DEFAULT uuid_generate_v4() PRIMARY KEY,
"gigId" uuid REFERENCES gig(id) ON DELETE CASCADE,
"categoryId" uuid REFERENCES category(id),
notes text,
"createdAt" timestamp without time zone NOT NULL DEFAULT now(),
"updatedAt" timestamp without time zone NOT NULL DEFAULT now(),
"salaryFixed" numeric,
"salaryHourly" numeric,
"salaryCurrency" character varying(3) DEFAULT 'SEK'::character varying,
"staffingMethod" character varying(255) NOT NULL DEFAULT 'auto'::character varying,
"staffingIspublished" boolean NOT NULL DEFAULT false,
"staffingActivateon" timestamp with time zone,
"staffingTarget" integer NOT NULL DEFAULT 0
);
ShiftEmployee DDL
CREATE TABLE "shiftEmployee" (
"employeeId" uuid REFERENCES employee(id) ON DELETE CASCADE,
"shiftId" uuid REFERENCES shift(id) ON DELETE CASCADE,
status character varying(255) NOT NULL,
"updatedAt" timestamp without time zone NOT NULL DEFAULT now(),
"salaryFixed" numeric,
"salaryHourly" numeric,
"salaryCurrency" character varying(3) DEFAULT 'SEK'::character varying,
CONSTRAINT "PK_6acfd2e8f947cee5a62ebff08a5" PRIMARY KEY ("employeeId", "shiftId")
);
Query
SELECT "id", "staffingTarget" FROM "shift" LEFT JOIN "shiftEmployee" "se" ON "se"."shiftId"="id" AND ("se"."status" = 'confirmed');
Result
id staffingTarget
68bb0892-9bce-4d08-b40e-757cb0889e87 3
12d88ff7-9144-469f-8de5-3e316c4b3bbd 6
73c65656-e028-4f97-b855-43b00f953c7b 5
68bb0892-9bce-4d08-b40e-757cb0889e88 3
e3279b37-2ba5-4f1d-b896-70085f2ba345 4
e3279b37-2ba5-4f1d-b896-70085f2ba346 5
e3279b37-2ba5-4f1d-b896-70085f2ba346 5
789bd2fb-3915-4cda-a3d7-2186cf5bb01a 3
Upvotes: 0
Views: 1111
Reputation:
If a post has more than one hidden comment, you will see that post multiple times because a join returns one row for each match - that's the nature of a join. And an outer join doesn't behave differently.
If your intention is to list only posts with hidden comments, it's better to use an EXISTS query instead:
SELECT p.id, p.title
FROM posts p
where exists (select *
from comments c
where c.post_id = p.id
and c.status = 'hidden');
Upvotes: 2