jamesatha
jamesatha

Reputation: 7600

sql query to get the most recent piece of data for a given set of columns

I have a postgres table that represents a history for 'elements' in my table. An element is determined by a grouping of 2 columns

The historic data has actions that occurred like "saved" or "deleted" which can happen many times for any element. These are mutually exclusive so when I look at the current state, care about the most recent action.

I want to count the number of "saved" events for an element. This needs to be the count of the most recent historic event that is a "saved" for every element. Is there one SQL query that does this well? I have 2 that I want to condense.

Table columns: time (timestamp), user_id (int), file_name (string), action_type (string)

The user_id + file_name combo determines an 'element' I care about but this combo is not unique in the table

What I want: Get the count of the file "index.html" for all users who most recently "saved" it as their action

Upvotes: 1

Views: 519

Answers (2)

You could use something like below. It first of all filters out all the SAVED records for index.html. Then it discards any of those records where there exists a new record for the same user and file.

So if user1 saved the file at 09:00 and at 12:00, only the 12:00 record would count.

select count(*) from tableName t
where t.action_type = 'SAVED'
and t.filename = 'index.html'
and not exists 
   (select * from tableName t2
    where t2.user_id = t.user_id and t2.filename = t.filename
    and t2.time > t.time
    and t2.action_type = 'SAVED')

If you want it so that a more recent delete record stops save being counted, remove the and t2.action_type = 'SAVED' bit

That way if user1 saved the file at 09:00 and deleted it at 11:00, none of the records would count. The deleted record isn't included in the results (because of t.action_type = 'SAVED'), but it is newer than the 09:00 record so stops that showing.

Upvotes: 1

Freelancer
Freelancer

Reputation: 9064

Since 'most recently' has not proper definition query can be:

select count(*) from tlbName where filename='index.html' and action='saved'

If time is there then:

select count(*) from tlbName where filename='index.html' and action='saved' and timestamp between 'timeval_1' and 'timeval_2'

Hope its helpful.

Upvotes: 0

Related Questions