Alex Herman
Alex Herman

Reputation: 2848

shape 2 complex GROUP BY / PARTITION BY queries into 1 query

I was trying to solve a task with just 1 query using GROUP BY, but ended up having a complex solution using WITH clause - so I have 2 queries combined. But I suspect I tend to over-engineer it.

Say, we have 3 tables:

(see the whole SQL for the tables' layout under section schema below)

For example, the task reads

find authors who have written books on more than 2 subjects

During my research I have come up with complex, over-engineered 2 solutions:

solution 1 - using PARTITION BY clause work out the row number

WITH temp1 AS 
(
    SELECT author_id, subject_id, ROW_NUMBER () OVER (PARTITION BY author_id) as rn
    FROM books
    GROUP BY author_id, subject_id
)
SELECT DISTINCT author_id FROM temp1 WHERE rn > 2
;

solution 2 - combining 2 queries containing GROUP BY clause

WITH temp1 AS 
(
    SELECT author_id, subject_id
    FROM books
    GROUP BY author_id, subject_id
)
SELECT DISTINCT author_id, COUNT(*) FROM temp1
GROUP BY author_id
HAVING COUNT(*) > 2
;

Is there any way to solve such a task with just 1 query?


Schema

The SQL below applies to PostgreSQL

CREATE SEQUENCE authors_id_seq;

CREATE TABLE authors
(
    id bigint NOT NULL DEFAULT nextval('authors_id_seq'::regclass),
    name TEXT NOT NULL,
    CONSTRAINT pk_authors PRIMARY KEY (id)
);

CREATE SEQUENCE subjects_id_seq;

CREATE TABLE subjects
(
    id bigint NOT NULL DEFAULT nextval('subjects_id_seq'::regclass),
    name TEXT NOT NULL,
    CONSTRAINT pk_subjects PRIMARY KEY (id)
);

CREATE SEQUENCE books_id_seq;

CREATE TABLE books
(
    id bigint NOT NULL DEFAULT nextval('books_id_seq'::regclass),
    name TEXT NOT NULL,
    popularity int NOT NULL,
    author_id bigint NOT NULL,
    subject_id bigint NOT NULL,
    CONSTRAINT pk_books PRIMARY KEY (id),
    CONSTRAINT fk_books_authors_author_id FOREIGN KEY (author_id)
        REFERENCES authors (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE CASCADE,
    CONSTRAINT fk_books_subjects_subject_id FOREIGN KEY (subject_id)
        REFERENCES subjects (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE CASCADE
);

seed data:

INSERT INTO authors (name) VALUES 
('John Doe'),
('Friedrich Nietzsche');
INSERT INTO subjects (name) VALUES 
('anatomy'),
('geography'),
('nature');
INSERT INTO books (name, popularity, author_id, subject_id) VALUES
('body', 5, 1, 1),
('earth', 10, 1, 2),
('ocean', 9, 1, 2),
('animals', 9, 1, 3),
('birds', 7, 2, 3);

Upvotes: 0

Views: 71

Answers (1)

Zaynul Abadin Tuhin
Zaynul Abadin Tuhin

Reputation: 31991

just you need distinct subject count based on author_id

SELECT author_id FROM books 
GROUP BY author_id
HAVING COUNT(DISTINCT subject_id) > 2

Upvotes: 2

Related Questions