Shaharyar
Shaharyar

Reputation: 12449

Sorted data in groups

Dataset:

id  uid     activity    postid  
1   20      A           1
2   20      A           1
3   6       A           1
4   3       A           1
5   6       A           1
6   13      A           1
7   13      B           1
8   18      B           1
9   18      B           1
10  1       A           1

Current Results:

id  uid     uid_list        groupCount  activity    postid
9   18      18,13           3           B           1
1   20      1,13,6,3,20     7           A           1

Expected Results:

id  uid     uid_list        groupCount  activity    postid
9   18      18,13           3           B           1
10  1       1,13,6,3,20     7           A           1

The query I have:

SELECT
    id,
    uid,
    GROUP_CONCAT(DISTINCT uid ORDER BY id DESC) as uid_list,
    COUNT(*) as groupCount,
    activity,
    postid
FROM (
    SELECT *
    FROM `user_activities`
    ORDER BY id DESC) as tbl
GROUP BY
    activity, postid
ORDER BY
    id DESC

I want to group by activity and postid while having the result in descending order by id. And want to have the latest id and uid for every group. I don't understand why this query doesn't return the expected output.

Upvotes: 4

Views: 53

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269883

Probably the simplest method is the group_concat()/substring_index() trick:

SELECT MAX(ID) as id,
       SUBSTRING_INDEX(GROUP_CONCAT(uid ORDER BY ID DESC), ',', 1) as uid,
       GROUP_CONCAT(DISTINCT uid ORDER BY id DESC) as uid_list,
       COUNT(*) as groupCount,
       activity, postid
FROM user_activities ua
GROUP BY activity, postid
ORDER BY id DESC;

There are some limitations to this approach, in the sense that GROUP_CONCAT() has a maximum length for the intermediate value. Typically the default is sufficient, but you might need to change that value if many, many rows match each group (and you already have this issue for the list of uids anyway).

Upvotes: 0

Kamil Gosciminski
Kamil Gosciminski

Reputation: 17157

From what I understand id value is increasing. To get the latest values you could use an aggregate function MAX().

Also, your inner query with ordering is unnecessary because the engine has to sort the resultset by id anyways when building result for GROUP_CONCAT().

To retrieve uid for a particular id column you need to self join to the same table.

SELECT
    a.id, b.uid, a.uid_list, a.groupcount, a.activity, a.postid
FROM (
    SELECT
        MAX(id) as id,
        GROUP_CONCAT(DISTINCT uid ORDER BY id DESC) as uid_list,
        COUNT(*) as groupCount,
        activity,
        postid
    FROM user_activities a
    GROUP BY
        activity, postid
    ) a
    INNER JOIN user_activities b ON a.id = b.id

Upvotes: 2

Related Questions