Chao Miin

Reputation: 43

Get the maximum value from rows in Postgres records and group by multiple columns

I have a table like this:

p_id |      createdat      | pagetitle | sessionid |      text       | device  | deviceserial
------+---------------------+-----------+-----------+-----------------+---------+--------------
      | 2020-11-27 08:07:39 |           |           | App launch      | android | 636363636890
      | 2020-09-01 08:08:18 |           |           | search          | Android | 636363636890
      | 2020-09-02 08:10:10 |           |           | scan            | Android | 636363636890
      | 2020-09-02 08:12:10 |           |           | destroy         | Android | 636363636890
      | 2020-09-02 08:40:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:45:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:43:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 08:50:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:47:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 08:53:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:50:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 08:55:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:52:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 09:00:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:55:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 09:05:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:59:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 09:07:11 |           |           | hi              | Android | 6625839827
      | 2020-09-02 09:01:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 09:09:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 09:03:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 09:09:11 |           |           | hi              | Android | 6625839828
      | 2020-09-02 09:03:10 |           |           | launchComponent | IOS     | 636363636891
      | 2020-09-02 09:13:11 |           |           | hi              | Android | 6625839828
      | 2020-09-02 09:06:10 |           |           | launchComponent | IOS     | 636363636891

From this table, I want to get something like this:

 deviceserial | event_count |         hr          | device
--------------+-------------+---------------------+---------
 6625839828   |           2 | 2020-09-02 09:00:00 | Android
 636363636890 |           8 | 2020-09-02 08:00:00 | Android
 636363636891 |           2 | 2020-09-02 09:00:00 | IOS
 6625839827   |           5 | 2020-09-02 08:00:00 | IOS
 

These are my steps: I group the records by deviceserial, by hour (as hr), and by device, count the events in each group, and then take max(event_count).

I tried this query:

select deviceserial,max(event_count) as event_count,hr,device
from (
    select deviceserial,count(*) as event_count,
        date_trunc('hour', createdat) as hr,device
    from devices  
    group by deviceserial,hr,device
) t
group by deviceserial,hr,device

This is my result:

 deviceserial | event_count |         hr          | device
--------------+-------------+---------------------+---------
 636363636890 |           1 | 2020-11-27 08:00:00 | android
 636363636891 |           2 | 2020-09-02 09:00:00 | IOS
 6625839827   |           4 | 2020-09-02 09:00:00 | IOS
 6625839827   |           5 | 2020-09-02 08:00:00 | IOS
 636363636890 |           8 | 2020-09-02 08:00:00 | Android
 636363636890 |           1 | 2020-09-01 08:00:00 | Android
 636363636890 |           2 | 2020-09-02 09:00:00 | Android
 6625839828   |           2 | 2020-09-02 09:00:00 | Android

Upvotes: 1

Views: 71

Answers (2)

GMB

Reputation: 222702

If I follow you correctly, you can use distinct on:

select distinct on (deviceserial) 
    deviceserial,
    count(*) as event_count,
    date_trunc('hour', createdat) as hr,
    device
from devices  
group by deviceserial, hr, device
order by deviceserial, event_count desc

This gives you the hour and device in which each device serial had the most events. Note, however, that it does not handle ties: it returns exactly one row per device serial, even when several hours share the top count. If you want to keep top ties, use rank() instead:

select *
from (
    select deviceserial,
        count(*) as event_count,
        date_trunc('hour', createdat) as hr,
        device,
        rank() over(partition by deviceserial order by count(*) desc) rn
    from devices  
    group by deviceserial, hr, device
) t
where rn = 1
order by deviceserial

Or, in Postgres 13:

select deviceserial,
    count(*) as event_count,
    date_trunc('hour', createdat) as hr,
    device
from devices  
group by deviceserial, hr, device
order by rank() over(partition by deviceserial order by count(*) desc)
fetch first row with ties
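
To experiment with the rank() approach without a Postgres server, here is a minimal sketch that runs the same two steps (count per serial, device and hour, then keep the top-ranked rows per serial) against an in-memory SQLite database from Python. Several things here are assumptions made for illustration and are not part of the original queries: SQLite stands in for Postgres, strftime() replaces date_trunc('hour', ...), the aggregation is moved into a CTE, SAMPLE_ROWS is a small hand-picked subset of the question's data, and the helper name top_hours is hypothetical.

```python
import sqlite3

# Hypothetical subset of the question's rows: (createdat, device, deviceserial).
SAMPLE_ROWS = [
    ("2020-09-02 08:10:10", "Android", "636363636890"),
    ("2020-09-02 08:12:10", "Android", "636363636890"),
    ("2020-09-02 08:43:10", "Android", "636363636890"),
    ("2020-09-02 09:01:10", "Android", "636363636890"),
    ("2020-09-02 08:40:11", "IOS", "6625839827"),
    ("2020-09-02 08:45:11", "IOS", "6625839827"),
    ("2020-09-02 09:00:11", "IOS", "6625839827"),
    ("2020-09-02 09:05:11", "IOS", "6625839827"),
]

QUERY = """
WITH hourly AS (
    -- Step 1: count events per serial, device and hour.
    -- strftime(...) is the SQLite stand-in for date_trunc('hour', createdat).
    SELECT deviceserial,
           device,
           strftime('%Y-%m-%d %H:00:00', createdat) AS hr,
           count(*) AS event_count
    FROM devices
    GROUP BY deviceserial, device, strftime('%Y-%m-%d %H:00:00', createdat)
),
ranked AS (
    -- Step 2: rank the hourly buckets of each serial, busiest first.
    SELECT hourly.*,
           rank() OVER (PARTITION BY deviceserial
                        ORDER BY event_count DESC) AS rn
    FROM hourly
)
SELECT deviceserial, event_count, hr, device
FROM ranked
WHERE rn = 1            -- keeps every bucket tied for first place
ORDER BY deviceserial, hr
"""


def top_hours(rows):
    """Return the busiest hour(s) per device serial, ties included."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE devices (createdat TEXT, device TEXT, deviceserial TEXT)"
        )
        conn.executemany("INSERT INTO devices VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in top_hours(SAMPLE_ROWS):
        print(row)
```

In this sample, serial 6625839827 has two events in each of its two hours, so both hours come back with rn = 1, whereas the distinct on variant would keep only one of them.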

Upvotes: 2

Popeye

Reputation: 35930

You can use the window function rank() as follows:

select * from
(select deviceserial,count(*) as event_count,
        date_trunc('hour', createdat) as hr, device,
        rank() over (partition by deviceserial order by count(*) desc) as rn
    from devices  
    group by deviceserial,hr,device) t
where rn = 1

Upvotes: 0
