Reputation: 7870
I have the two following tables (with some sample datas)
LOGS:
ID | SETID | DATE ======================== 1 | 1 | 2010-02-25 2 | 2 | 2010-02-25 3 | 1 | 2010-02-26 4 | 2 | 2010-02-26 5 | 1 | 2010-02-27 6 | 2 | 2010-02-27 7 | 1 | 2010-02-28 8 | 2 | 2010-02-28 9 | 1 | 2010-03-01
STATS:
ID | OBJECTID | FREQUENCY | STARTID | ENDID ============================================= 1 | 1 | 0.5 | 1 | 5 2 | 2 | 0.6 | 1 | 5 3 | 3 | 0.02 | 1 | 5 4 | 4 | 0.6 | 2 | 6 5 | 5 | 0.6 | 2 | 6 6 | 6 | 0.4 | 2 | 6 7 | 1 | 0.35 | 3 | 7 8 | 2 | 0.6 | 3 | 7 9 | 3 | 0.03 | 3 | 7 10 | 4 | 0.6 | 4 | 8 11 | 5 | 0.6 | 4 | 8 7 | 1 | 0.45 | 5 | 9 8 | 2 | 0.6 | 5 | 9 9 | 3 | 0.02 | 5 | 9
Every day new logs are analyzed on different sets of objects and stored in table LOGS. Among other processes, some statistics are computed on the objects contained into these sets and the result are stored in table STATS. These statistic are computed through several logs (identified by the STARTID and ENDID columns).
So, what could be the SQL query that would give me the latest computed stats for all the objects with the corresponding log dates.
In the given example, the result rows would be:
OBJECTID | SETID | FREQUENCY | STARTDATE | ENDDATE ====================================================== 1 | 1 | 0.45 | 2010-02-27 | 2010-03-01 2 | 1 | 0.6 | 2010-02-27 | 2010-03-01 3 | 1 | 0.02 | 2010-02-27 | 2010-03-01 4 | 2 | 0.6 | 2010-02-26 | 2010-02-28 5 | 2 | 0.6 | 2010-02-26 | 2010-02-28
So, the most recent stats for set 1 are computed with logs from feb 27 to march 1 whereas stats for set 2 are computed from feb 26 to feb 28. object 6 is not in the results rows as there is no stat on it within the last period of time.
Last thing, I use MySQL.
Any Idea ?
Upvotes: 6
Views: 216
Reputation: 238058
If there are no ties, you can use a filtering join. For example:
select stats.objectid
, stats.frequency
, startlog.setid
, startlog.date
, endlog.date
from stats
join logs startlog
on startlog.id = stats.startid
join logs endlog
on endlog.id = stats.endid
join (
select objectid, max(endlog.date) as maxenddate
from stats
join logs endlog
on endlog.id = stats.endid
group by objectid
) filter
on stats.objectid = filter.objectid
and filter.maxenddate = endlog.date
order by stats.objectid
Your example results appear to be slightly off, for example there is no row for objectid 5 where the frequency equals 0.35.
Upvotes: 1
Reputation: 15717
Does this query fit to your question ?
SELECT objectid, l1.setid, frequency, l1.date as startdate, l2.date as enddate
FROM `logs` l1
INNER JOIN `stats` s ON (s.startid=l1.id)
INNER JOIN `logs` l2 ON (l2.id=s.endid)
INNER JOIN
(
SELECT setid, MAX(date) as date
FROM `logs` l
INNER JOIN `stats` s ON (s.startid=l.id)
GROUP BY setid
) d ON (d.setid=l1.setid and d.date=l1.date)
ORDER BY objectid
Upvotes: 3