pierroz
pierroz

Reputation: 7870

Merge queries into one query

I have the two following tables (with some sample datas)

LOGS:

 ID | SETID | DATE
========================
 1  |   1   | 2010-02-25
 2  |   2   | 2010-02-25
 3  |   1   | 2010-02-26
 4  |   2   | 2010-02-26
 5  |   1   | 2010-02-27
 6  |   2   | 2010-02-27
 7  |   1   | 2010-02-28
 8  |   2   | 2010-02-28
 9  |   1   | 2010-03-01

STATS:

 ID | OBJECTID | FREQUENCY | STARTID | ENDID
=============================================
 1  |    1     |   0.5     |    1    |   5
 2  |    2     |   0.6     |    1    |   5
 3  |    3     |   0.02    |    1    |   5
 4  |    4     |   0.6     |    2    |   6
 5  |    5     |   0.6     |    2    |   6
 6  |    6     |   0.4     |    2    |   6
 7  |    1     |   0.35    |    3    |   7
 8  |    2     |   0.6     |    3    |   7
 9  |    3     |   0.03    |    3    |   7
 10 |    4     |   0.6     |    4    |   8
 11 |    5     |   0.6     |    4    |   8
 7  |    1     |   0.45    |    5    |   9
 8  |    2     |   0.6     |    5    |   9
 9  |    3     |   0.02    |    5    |   9

Every day new logs are analyzed on different sets of objects and stored in table LOGS. Among other processes, some statistics are computed on the objects contained into these sets and the result are stored in table STATS. These statistic are computed through several logs (identified by the STARTID and ENDID columns).

So, what could be the SQL query that would give me the latest computed stats for all the objects with the corresponding log dates.
In the given example, the result rows would be:

OBJECTID | SETID | FREQUENCY |  STARTDATE | ENDDATE
======================================================
   1     |   1   |    0.45   | 2010-02-27 | 2010-03-01
   2     |   1   |    0.6    | 2010-02-27 | 2010-03-01
   3     |   1   |    0.02   | 2010-02-27 | 2010-03-01
   4     |   2   |    0.6    | 2010-02-26 | 2010-02-28
   5     |   2   |    0.6    | 2010-02-26 | 2010-02-28

So, the most recent stats for set 1 are computed with logs from feb 27 to march 1 whereas stats for set 2 are computed from feb 26 to feb 28. object 6 is not in the results rows as there is no stat on it within the last period of time.

Last thing, I use MySQL.

Any Idea ?

Upvotes: 6

Views: 216

Answers (2)

Andomar
Andomar

Reputation: 238058

If there are no ties, you can use a filtering join. For example:

select   stats.objectid
,        stats.frequency
,        startlog.setid
,        startlog.date
,        endlog.date
from     stats
join     logs startlog
on       startlog.id = stats.startid
join     logs endlog
on       endlog.id = stats.endid
join     (
         select   objectid, max(endlog.date) as maxenddate
         from     stats
         join     logs endlog
         on       endlog.id = stats.endid
         group by objectid
         ) filter
on       stats.objectid = filter.objectid
         and filter.maxenddate = endlog.date
order by stats.objectid

Your example results appear to be slightly off, for example there is no row for objectid 5 where the frequency equals 0.35.

Upvotes: 1

Patrick
Patrick

Reputation: 15717

Does this query fit to your question ?

SELECT objectid, l1.setid, frequency, l1.date as startdate, l2.date as enddate 
FROM `logs` l1
INNER JOIN `stats` s ON (s.startid=l1.id)
INNER JOIN `logs` l2 ON (l2.id=s.endid)
INNER JOIN
 (
   SELECT setid, MAX(date) as date
   FROM `logs` l
   INNER JOIN `stats` s ON (s.startid=l.id) 
   GROUP BY setid
 ) d ON (d.setid=l1.setid and d.date=l1.date)
ORDER BY objectid

Upvotes: 3

Related Questions