Stefan
Stefan

Reputation: 3051

Get "latest" row after GROUP BY over multiple tables

I'd preferably like to first query listed below and just group by stories.id, but I get the following error:

ERROR: column "u.first_name" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT "s".*, "u"."first_name", "u"."last_name", ("i"."filen...

The second query works but does not group by stories.id and generates the wrong results. Is it possible to select from multiple tables and not group by all of them?

The table panels also has a column updated_at. I would like to get the newest file per story according to panels.updated_at.

   SELECT 
        "s".*, 
        "u"."first_name", 
        "u"."last_name", 
        ("i"."filename" || '.' || "i"."extension") AS "file" 
    FROM 
        "stories" "s" 
    LEFT JOIN "panels" "p" ON("p"."story_id" = "s"."id") 
    LEFT JOIN "users" "u" ON("s"."user_id" = "u"."uid")
    LEFT JOIN "images" "i" ON ("p"."image_id" = "i"."id") 
    WHERE 
        "s"."complete" = false AND 
        "s"."created_by" = 205700489 
    GROUP BY 
        "s"."id", 
    ORDER BY 
        "s"."created_at" DESC

   SELECT 
        "s".*, 
        "u"."first_name", 
        "u"."last_name", 
        ("i"."filename" || '.' || "i"."extension") AS "file" 
    FROM 
        "stories" "s" 
    LEFT JOIN "panels" "p" ON("p"."story_id" = "s"."id") 
    LEFT JOIN "users" "u" ON("s"."user_id" = "u"."uid")
    LEFT JOIN "images" "i" ON ("p"."image_id" = "i"."id") 
    WHERE 
        "s"."complete" = false AND 
        "s"."created_by" = 205700489 
    GROUP BY 
        "s"."id", 
        "u"."first_name", 
        "u"."last_name", "i"."filename", 
        "i"."extension" 
    ORDER BY 
        "s"."created_at" DESC

Upvotes: 1

Views: 396

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 658492

Updated after clarification of the question:

SELECT DISTINCT ON (s.created_at, s.id)
       s.*
      ,u.first_name
      ,u.last_name
      ,concat_ws('.', i.filename, i.extension) AS file
FROM   stories s 
LEFT   JOIN users  u ON u.uid = s.user_id
LEFT   JOIN panels p ON p.story_id = s.id
LEFT   JOIN images i ON i.id = p.image_id
WHERE  s.complete = false
AND    s.created_by = 205700489 
ORDER  BY s.created_at DESC, s.id, p.updated_at DESC;

Grouping by primary key requires PostgreSQL 9.1.
I use concat_ws(), because I don't know which columns might be NULL. If both i.filename and i.extension are defined NOT NULL, you can simplify.

Effect of the additional ORDER BY item p.updated_at DESC is that the "newest" file will be picked per story. The query technique is explained in full under this related question:
Select first row in each GROUP BY group?

Upvotes: 2

Ihor Romanchenko
Ihor Romanchenko

Reputation: 28591

You can write something like:

   SELECT 
        "s".*, 
        (SELECT "u"."first_name" 
         FROM "users" "u"
         WHERE "s"."user_id" = "u"."uid"
         LIMIT 1) , 
        (SELECT "u"."last_name" 
         FROM "users" "u"
         WHERE "s"."user_id" = "u"."uid"
         LIMIT 1), 
        (SELECT "i"."filename" || '.' || "i"."extension"
         FROM "panels" "p" 
         JOIN "images" "i" ON ("p"."image_id" = "i"."id")
         WHERE "p"."story_id" = "s"."id" 
         LIMIT 1) AS "file" 
    FROM 
        "stories" "s" 
    WHERE 
        "s"."complete" = false AND 
        "s"."created_by" = 205700489 
    ORDER BY 
        "s"."created_at" DESC

It will get only 1 record from "users" and "panels" JOIN "images" per record in "stories" .

Add ORDER BY, extra WHERE or some aggregates to get what you need from "users" and "panels" JOIN "images"

UPD Also, you can use something like this:

SELECT *
FROM (
SELECT DISTINCT ON ("s"."id") 
    "s".*, 
    "u"."first_name", 
    "u"."last_name", 
    ("i"."filename" || '.' || "i"."extension") AS "file" 
FROM 
    "stories" "s" 
LEFT JOIN "panels" "p" ON("p"."story_id" = "s"."id") 
LEFT JOIN "users" "u" ON("s"."user_id" = "u"."uid")
LEFT JOIN "images" "i" ON ("p"."image_id" = "i"."id") 
WHERE 
    "s"."complete" = false AND 
    "s"."created_by" = 205700489 
ORDER BY 
    "s"."id"
) t ORDER BY "t"."created_at" DESC

It will leave only one row for every distinct "s"."id"

Upvotes: 0

Related Questions