reans
reans

Reputation: 430

Mysql: group by latest query optimization

I've been stuck on optimizing this query. I'm hoping I haven't over-simplified my example.

I have a table the following tables:

Athletes: id, name, team_id

Measurements: id, type, value, athlete_id, measured_at

I need to find the latest, oldest and penultimate measurements and so I wrote this query:

select 
    m.athlete_id, 
    m.`type`,
    (
        select ml.`value`
        from measurements ml 
        where ml.athlete_id = m.athlete_id
        and ml.`type` = m.`type`
        order by ml.measured_at desc limit 1
    )
    as latest_record,
    (
        select mo.`value`
        from measurements mo 
        where mo.athlete_id = m.athlete_id
        and mo.`type` = m.`type`
        order by mo.measured_at asc limit 1
    )
    as oldest_record,
    (
        select mp.`value`
        from measurements mp
        where mp.athlete_id = m.athlete_id
        and mp.`type` = m.`type`
        order by mp.measured_at desc limit 1 offset 1
    )
    as penultimate_record
    from measurements m
    group by m.athlete_id, m.`type`;

This gets all I asked for but it's not performant, or at least I'd need to modify other things in my application if it must be this slow.

I've also tried moving the select logic to left lateral joins eg.

left join lateral
(
    SELECT mlr.id as measurement_id, mlr.value as `value` 
    FROM measurements mlr 
    WHERE mlr.athlete_id = s.id
    and mlr.type = v.type
    ORDER BY mlr.measured_at ASC LIMIT 1
) 
as latest_record on m.id = latest_record.measurement_id 

but this is a similar speed to the above. I also tried putting database indexes on columns, but didn't see any improvement.

Is there an index I can put here, or another way to write this to get the query to perform better?

thanks

Upvotes: 0

Views: 43

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269493

I would suggest conditional aggregation:

select m.athlete_id, m.`type`,
       max(case when seqnum_desc = 1 then value end) as latest_record,
       max(case when seqnum_asc = 1 then value end) as oldest_record,
       max(case when seqnum_desc = 2 then value end) as penultimate_record
    from (select m.*,
                 row_number() over (partition by athlete_id, type order by measured_at asc) as seqnum_asc,
                 row_number() over (partition by athlete_id, type order by measured_at desc) as seqnum_desc
          from measurements m
         ) m
group by m.athlete_id, m.`type`;

This still incurs the overhead for grouping -- which might be expensive. You might want an index on measures(athlete_id, type, measured_at).

Upvotes: 3

Related Questions