Ken
Ken

Reputation: 78922

How to select the most recent set of dated records from a mysql table

I am storing the response to various rpc calls in a mysql table with the following fields:

Table: rpc_responses

timestamp   (date)
method      (varchar)
id          (varchar)
response    (mediumtext)

PRIMARY KEY(timestamp,method,id)

What is the best method of selecting the most recent responses for all existing combinations of method and id?

Sample data:

timestamp  method  id response
2009-01-10 getThud 16 "....."
2009-01-10 getFoo  12 "....."
2009-01-10 getBar  12 "....."
2009-01-11 getFoo  12 "....."
2009-01-11 getBar  16 "....."

Desired result:

2009-01-10 getThud 16 "....."
2009-01-10 getBar 12 "....."
2009-01-11 getFoo 12 "....."
2009-01-11 getBar 16 "....."

(I don't think this is the same question - it won't give me the most recent response)

Upvotes: 25

Views: 104861

Answers (8)

questionto42
questionto42

Reputation: 9610

Checking the three main answers in some other use case shows that the most voted answer is also by far the fastest, swarm intelligence works here:

# Answer 1: https://stackoverflow.com/a/12625667/11154841
# 165ms
SELECT
    COUNT(0)
FROM
    (
    SELECT
        mtn.my_primary_key,
        mtn.my_info_col,
        IF(@last_my_primary_key = my_primary_key,
        0,
        1) AS new_my_primary_key_group,
        @last_my_primary_key := my_primary_key
    FROM
        my_db_schema.my_table_name mtn
    WHERE
        mtn.date_time_col > now() - INTERVAL 1 MONTH
    ORDER BY
        my_primary_key,
        mtn.date_time_col DESC
) AS t1
WHERE
    new_my_primary_key_group = 1
    AND t1.my_info_col = 'delete';

# Answer 2: https://stackoverflow.com/a/435709/11154841
# 757ms
SELECT
    count(0)
FROM
    my_db_schema.my_table_name mtn
JOIN
(
    SELECT
        my_primary_key,
        max(date_time_col) AS date_time_col
    FROM
        my_db_schema.my_table_name mtn
    WHERE
        mtn.date_time_col > now() - INTERVAL 1 MONTH
    GROUP BY
        mtn.my_primary_key) latest
        USING (my_primary_key,
    date_time_col)
WHERE
    mtn.my_info_col = 'delete';

# Answer 3: https://stackoverflow.com/a/3185644/11154841
# 1.310s
SELECT
    count(0)
FROM
    my_db_schema.my_table_name mtn
WHERE
    mtn.date_time_col = (
    SELECT
        max(mtn2.date_time_col)
    FROM
        my_db_schema.my_table_name mtn2
    WHERE
        mtn2.my_primary_key = mtn.my_primary_key
        AND mtn2.date_time_col > now() - INTERVAL 1 MONTH
)
    AND mtn.date_time_col > now() - INTERVAL 1 MONTH
    AND mtn.my_info_col = 'delete';

Upvotes: 0

Ken
Ken

Reputation: 78922

Self answered, but I'm not sure that it will be an efficient enough solution as the table grows:

SELECT timestamp,method,id,response FROM rpc_responses 
INNER JOIN
(SELECT max(timestamp) as timestamp,method,id FROM rpc_responses GROUP BY method,id) latest
USING (timestamp,method,id);

Upvotes: 15

velcrow
velcrow

Reputation: 6516

This solution was updated recently.
Comments below may be outdated

This can query may perform well, because there are no joins.

SELECT * FROM (
    SELECT *,if(@last_method=method,0,1) as new_method_group,@last_method:=method 
    FROM rpc_responses 
    ORDER BY method,timestamp DESC
) as t1
WHERE new_method_group=1;

Given that you want one resulting row per method this solution should work, using mysql variables to avoid a JOIN.

FYI, PostgreSQL has a way of doing this built into the language:

SELECT DISTINCT ON (method) timestamp, method, id, response
FROM rpc_responses
WHERE 1 # some where clause here
ORDER BY method, timestamp DESC

Upvotes: 30

versek
versek

Reputation: 61

Try this...

SELECT o1.id, o1.timestamp, o1.method, o1.response   
FROM rpc_responses o1
WHERE o1.timestamp = ( SELECT max(o2.timestamp)
                       FROM rpc_responses o2
                       WHERE o1.id = o2.id )
ORDER BY o1.timestamp, o1.method, o1.response

...it even works in Access!

Upvotes: 6

Simon
Simon

Reputation: 17

Subquery is very taxing when the data set becomes larger.

Try this:

SELECT t1.* 
FROM rpc_responses AS t1 
INNER JOIN rpc_responses AS t2 
GROUP BY t1.method, t1.id, t1.timestamp
HAVING t1.timestamp=MAX(t2.timestamp)    
ORDER BY t1.timestamp, t1.method, t1.response;

Upvotes: 0

charles
charles

Reputation: 307

i used this,worked for me

select max(timestamp),method,id from tables where 1 group by method,id order by timestamp desc 

Upvotes: 0

Neil
Neil

Reputation: 2296

The concept of "most recent" is fairly vague. If you mean something like the 100 most recent rows then you can just add a TOP(100) to your SELECT clause.

If you mean the "most recent" based on a the most recent date then you can just do

SELECT timestamp,method,id,response 
FROM rpc_responses
HAVING max(timestamp) = timestamp 

Upvotes: -1

spi
spi

Reputation: 41

...is more than one year later but i might help someone To select all the queries starting from latest

SELECT *
FROM rpc_responses
ORDER BY timestamp DESC

Upvotes: -4

Related Questions