Reputation: 35
I have the following tables:
CREATE TABLE `data` (
`date_time` decimal(26,6) NOT NULL,
`channel_id` mediumint(8) unsigned NOT NULL,
`value` varchar(40) DEFAULT NULL,
`status` tinyint(3) unsigned DEFAULT NULL,
`connected` tinyint(1) unsigned NOT NULL,
PRIMARY KEY (`channel_id`,`date_time`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `channels` (
`channel_id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`channel_name` varchar(40) NOT NULL,
PRIMARY KEY (`channel_id`),
UNIQUE KEY `channel_name` (`channel_name`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
I was wondering if anyone could give me some advice on how to optimize or rewrite the following query:
SELECT channel_name, t0.date_time, t0.value, t0.status, t0.connected, t1.date_time, t1.value, t1.status, t1.connected FROM channels,
(SELECT MAX(date_time) AS date_time, channel_id, value, status, connected FROM data
WHERE date_time <= 1300818330
GROUP BY channel_id) AS t0
RIGHT JOIN
(SELECT MAX(date_time) AS date_time, channel_id, value, status, connected FROM data
WHERE date_time <= 1300818334
GROUP BY channel_id) AS t1
ON t0.channel_id = t1.channel_id
WHERE channels.channel_id = t1.channel_id
Basically I am getting the value, status and connected fields for each channel_name at two different times. Since t0 is always <= t1, the fields could exist for t1, but not t0, and I want that to be shown. That is why I am using the RIGHT JOIN. If it does not exist for t1, then it won't exist for t0, so no row should be returned.
The problem seems to be that since I am joining sub queries, no index can be used? I tried rewriting it to do a self join on the channel_id of the data table first but that is millions of rows.
It would also be nice to be able to add a boolean field to each of the final rows that is true when t0.value = t1.value & t0.status = t1.status & t0.connected = t1.connected.
Thank you very much for your time.
Upvotes: 1
Views: 5506
Reputation: 107746
You can reduce the two sub-queries to one
SELECT channel_id,
MAX(date_time) AS t1_date_time,
MAX(case when date_time <= {$p1} then date_time end) AS t0_date_time
FROM data
WHERE date_time <= {$p2}
GROUP BY channel_id
GROUP BY is notoriously misleading in MySQL. Imagine if you had MIN() and MAX() in the same select, which row should the non-grouped columns come from? Once you understand this, you will see why it is not deterministic.
To get the full t0 and t1 rows
SELECT x.channel_id,
t0.date_time, t0.value, t0.status, t0.connected,
t1.date_time, t1.value, t1.status, t1.connected
FROM (
SELECT channel_id,
MAX(date_time) AS t1_date_time,
MAX(case when date_time <= {$p1} then date_time end) AS t0_date_time
FROM data
WHERE date_time <= {$p2}
GROUP BY channel_id
) x
INNER JOIN data t1 on t1.channel_id = x.channel_id and t1.date_time = x.t1_date_time
LEFT JOIN data t0 on t0.channel_id = x.channel_id and t0.date_time = x.t0_date_time
And finally a join to get the channel name
SELECT c.channel_name,
t0.date_time, t0.value, t0.status, t0.connected,
t1.date_time, t1.value, t1.status, t1.connected,
t0.value=t1.value AND t1.status=t0.status
AND t0.connected=t1.connected name_me
FROM (
SELECT channel_id,
MAX(date_time) AS t1_date_time,
MAX(case when date_time <= {$p1} then date_time end) AS t0_date_time
FROM data
WHERE date_time <= {$p2}
GROUP BY channel_id
) x
INNER JOIN channels c on c.channel_id = x.channel_id
INNER JOIN data t1 on t1.channel_id = x.channel_id and t1.date_time = x.t1_date_time
LEFT JOIN data t0 on t0.channel_id = x.channel_id and t0.date_time = x.t0_date_time
To perform an RLIKE on channel name, it looks simple enough to add a WHERE clause at the end of the query on c.channel_name
. It may however perform better to filter it at the subquery, making use of MySQL feature of processing comma-notation joins left to right.
SELECT x.channel_name,
t0.date_time, t0.value, t0.status, t0.connected,
t1.date_time, t1.value, t1.status, t1.connected,
t0.value=t1.value AND t1.status=t0.status
AND t0.connected=t1.connected name_me
(
SELECT c.channel_id, c.channel_name,
MAX(d.date_time) AS t1_date_time,
MAX(case when d.date_time <= {$p1} then d.date_time end) AS t0_date_time
FROM channels c, data d
WHERE c.channel_name RLIKE {$expr}
AND c.channel_id = d.channel_id
AND d.date_time <= {$p2}
GROUP BY c.channel_id
) x
INNER JOIN data t1 on t1.channel_id = x.channel_id and t1.date_time = x.t1_date_time
LEFT JOIN data t0 on t0.channel_id = x.channel_id and t0.date_time = x.t0_date_time
Upvotes: 2