joinjohn
joinjohn

Reputation: 13

How to join multiple times on the same table with the max value for each row

I have a table activity that has 5 fields:

id - numeric
user_id - numeric 
type_id - numeric
content_id - numeric
date_of_activity date

I need to make a query, the result of which will be 7 columns:

id(type1 activity),content_id(type 1) at max(date_of_activity for type 1),max(date_of_activity for type 1)content_id(type 2) at max(date_of_activity for type 2),max(date_of_activity for type 2)content_id(type 3) at max(date_of_activity for type 3),max(date_of_activity for type 3)

If there is no corresponding value for type_id for an user then NULL value should be shown in corresponding column.

Example:

users
    user_id: 1
    user_id: 2
    user_id: 3
activities

    id: 1
    user_id: 1
    type_id: 2
    date_of_activity: 2021/05/01
    content_id: 12

    id: 2
    user_id: 1
    type_id: 3
    date_of_activity: 2021/05/01
    content_id: 102

    id: 3
    user_id: 2
    type_id: 3
    date_of_activity: 2021/05/01
    content_id: 213

    id: 4
    user_id: 1
    type_id: 2
    date_of_activity: 2021/05/02
    content_id: 13

    id: 5
    user_id: 1
    type_id: 3
    date_of_activity: 2021/05/02
    content_id: 103

    id: 6
    user_id: 1
    type_id: 1
    date_of_activity: 2021/05/03
    content_id: 1

    id: 7
    user_id: 2
    type_id: 1
    date_of_activity: 2021/05/03
    content_id: 212
    
    id: 8
    user_id: 1
    type_id: 3
    date_of_activity: 2021/05/03
    content_id: 104

    id: 8
    user_id: 1
    type_id: 3
    date_of_activity: 2021/05/04
    content_id: 105

    id: 10
    user_id: 1
    type_id: 1
    date_of_activity: 2021/05/05
    content_id: 2

    id: 11
    user_id: 1
    type_id: 3
    date_of_activity: 2021/05/05
    content_id: 106

the result should be:

id:10,
user_id:1,
type_id:1,
date_of_activity: 2021/05/05,
activity2.date_of_activity:2021/05/02,
activity2.content_id:13,
activity3.date_of_activity: 2021/05/05,
activity3.content_id:106

id:7,
user_id: 2,
type_id: 1,
date_of_activity: 2021/05/03,
content_id: 212,
activity2.date_of_activity: NULL,
activity2.content_id: NULL,
activity3.date_of_activity:2021/05/01
activity3.content_id:213

As you can see i retreived one activity , per activity type for each user, each activity being the most recent one, and so on for all the users...

I have a query than can retrieve the most recent activity for each user for ONE activity type :

SELECT
activities.*,
dDates.max
FROM activities
JOIN users on users.id = activities.user_id
INNER JOIN (SELECT user_id,max(activities.date_of_activity) FROM activities WHERE activities.type_id = 6 AND activities.deleted_at IS NULL group by user_id) as dDates on activities.user_id = maxDates.user_id
WHERE activities.type_id = 6 AND activities.deleted_at IS NULL AND activities.date_of_activity = dDates.max;

But sometimes i need to join on multiples activities as part of the same query so i'm trying to grab everything at once but i can't find a way that isn't very slow.

i've tried created a view / use "With" for each type :

CREATE VIEW MostRecentType1 AS
SELECT
activities.*,
dDates.max
FROM activities
JOIN users on users.id = activities.user_id
INNER JOIN (SELECT user_id,max(activities.date_of_activity) FROM activities WHERE activities.type_id = 1 AND activities.deleted_at IS NULL group by user_id) as dDates on activities.user_id = maxDates.user_id
WHERE activities.type_id = 1 AND activities.deleted_at IS NULL AND activities.date_of_activity = dDates.max;
CREATE VIEW MostRecentType2 AS
SELECT
activities.*,
dDates.max
FROM activities
JOIN users on users.id = activities.user_id
INNER JOIN (SELECT user_id,max(activities.date_of_activity) FROM activities WHERE activities.type_id = 1 AND activities.deleted_at IS NULL group by user_id) as dDates on activities.user_id = maxDates.user_id
WHERE activities.type_id = 1 AND activities.deleted_at IS NULL AND activities.date_of_activity = dDates.max;

and then a finally a left join on the views

SELECT
activities.*,
dDates.max,
type2.date_of_activity as "type2.date_of_activity as",
type2.content_id as "type2.content_id",
type3.date_of_activity as "type3.date_of_activity" ,
type3.content_id as "type3.content_id" ,
FROM activities
JOIN users on users.id = activities.user_id
INNER JOIN (SELECT user_id,max(activities.date_of_activity) FROM activities WHERE activities.type_id = 1 AND activities.deleted_at IS NULL group by user_id) as dDates on activities.user_id = maxDates.user_id
LEFT JOIN MostRecentType2 as type2 on activities.user_id = type2.user_id
LEFT JOIN MostRecentType3 as type3 on activities.user_id = type3.user_id
WHERE activities.type_id = 1 AND activities.deleted_at IS NULL AND activities.date_of_activity = dDates.max;

But this doesn't scale well at all.

On my mostly empty activities table if i just perform the 1st query alone it will take 4ms, but once i add the first "MostRecentType2" join it jumps to 40ms, then MostRecentType3 go to 80 ms and keeps on incrementing this way and in some case i need lots of joins.

Any ideas on how to (better) make that query ?

Upvotes: 1

Views: 42

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269763

Hmmm . . . You can use DISTINCT ON to get the most recent row for each type for each user. Then just use conditional aggregation:

SELECT user_id,
       MAX(date_of_activity) FILTER (WHERE type_id = 1) as date_1,
       MAX(content_id) FILTER (WHERE type_id = 1) as content_1,
       MAX(date_of_activity) FILTER (WHERE type_id = 2) as date_2,
       MAX(content_id) FILTER (WHERE type_id = 2) as content_2,
       MAX(date_of_activity) FILTER (WHERE type_id = 3) as date_3,
       MAX(content_id) FILTER (WHERE type_id = 3) as content_3
FROM (SELECT DISTINCT ON (user_id, type_id) a.*
      FROM activities a
      WHERE a.deleted_at IS NULL
      ORDER BY user_id, type_id, date_of_activity DESC
     ) a
GROUP BY user_id

Upvotes: 0

Related Questions