user1330974
user1330974

Reputation: 2616

Selecting the most recent entry (row) using datetime column

I have a table that looks like this:

table_name  |  event_time           |  row_count  |  num_of_times_observed
test        |  2016-11-30 15:33:47  |  200        |  0
test        |  2016-11-30 16:03:03  |  400        |  0
test11      |  2016-11-30 15:33:34  |  300        |  0
test11      |  2016-11-30 15:57:49  |  400        |  0

I would like to find the row that has the most recent value in event_timegiven a table name (or to be more general, by each distinct table name). For example, if we are to find the most recent time for table_name, 'test11', we'd get a result like this:

table_name  |  event_time           |  row_count  |  num_of_times_observed
test11      |  2016-11-30 15:57:49  |  400        |  0

There are two standard (beginner) approaches that I can think of to achieve that:

SELECT table_name -- Approach#1
    ,event_time
    ,row_count
    ,num_of_times_observed
FROM my_table AS u
WHERE table_name = 'test11'
ORDER BY event_time DESC LIMIT 1

Or this:

SELECT table_name -- Approach#2
    ,event_time
    ,row_count
    ,num_of_times_observed
FROM (
    SELECT *
    FROM my_table
    WHERE table_name = 'test11'
    ) AS u -- I really don't need to filter by table_name here, but I hope it will improve the performance by just a little (especially if there are >100K rows for 'test11')?
INNER JOIN (
    SELECT table_name
        ,max(event_time) AS event_time
    FROM my_table
    GROUP BY table_name
    HAVING table_name = 'test11'
    ) AS q ON u.table_name = q.table_name
    AND u.event_time = q.event_time

Assuming there are ~100 million rows in my_table, I feel like the above approaches may not be very efficient (optimized). I looked around StackOverflow for possible answers and found answers such as this one that are more advanced. I wonder if there are better (efficient) way to query for the desired result.

Thank you very much for your answers and suggestions!

Upvotes: 0

Views: 111

Answers (2)

ScaisEdge
ScaisEdge

Reputation: 133380

You can use also

for each table

select * from my_table 
where ( table_name, event_time) in ( select table_name, max(event_time) 
                                from my_table
                                    group by table_name )

or if you are using a db that not allowd the use of tuple you can use
a join

 select * from my_table t1 
 INNER JOIN (  
   select table_name, max(event_time) max_event 
   from my_table
   group by table_name ) t2 on t2.table_name = t1.table_name 
                                   and t2.max_event = t1.event_time

for absolute

select * from my_table as  u
where event_time in ( select max(event_time )  from my_table)

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270463

Your first approach is the best approach.

You want an index on my_table(table_name, event_time). Some databases allow you to specify desc on the column when creating the index.

Upvotes: 1

Related Questions