ahkam
ahkam

Reputation: 767

How to group by a particular column and get the distinct set from it in postgres?

In my node app, I'm querying postgres table which looks's like this.

id,file,start_time,expire_time,status
1,file_1,2021-08-12 22:00:00,2021-08-15 22:00:00,completed
2,file_2,2021-08-12 22:00:00,2021-08-15 23:00:00,completed
3,file_3,2021-08-12 22:00:00,2021-08-15 24:00:00,completed
4,file_4,2021-08-13 17:00:00,2021-08-15 22:00:00,completed
5,file_2,2021-08-14 22:00:00,2021-08-15 13:00:00,completed
6,file_3,2021-08-14 22:00:00,2021-08-15 14:00:00,completed
7,file_1,2021-08-14 22:00:00,2021-08-15 215:00:00,completed

In the table, rows 1,2,3 and 5,6,7 are two same set of files and for those 2 sets, only the start_time is varying. So if u group by start time and take only the distinct set, finally the result should be like,

1,file_1,2021-08-12 22:00:00,2021-08-15 22:00:00,completed
2,file_2,2021-08-12 22:00:00,2021-08-15 23:00:00,completed
3,file_3,2021-08-12 22:00:00,2021-08-15 24:00:00,completed
4,file_4,2021-08-13 17:00:00,2021-08-15 22:00:00,completed

So i have tried many ways to group them. But Still i couldn't get a correct result. Below is the query

select distinct on (file) * 
from glacier_restore_progress
where status='completed'
group by start_time;

and i get an error

ERROR: column "glacier_restore_progress.id" must appear in the GROUP BY clause or be used in an aggregate function

What am I missing here?

Upvotes: 0

Views: 125

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269443

This answers the original version of the question.

If I understand correctly, you want one row per file based on either the earliest start_time. If so, you can use distinct on:

select distinct on (file) grp.*
from glacier_restore_progress grp
where status = 'completed'
order by file, start_time;

I notice that you got most of the way here. The key is that no group by is used in this case, only an order by that starts with the distinct on keys followed by the ordering you want within each key.

Here is a db<>fiddle.

Upvotes: 1

Related Questions