Reputation: 758
I have a query like this:
SELECT DISTINCT devices1_.id AS id27_, devices1_.createdTime AS createdT2_27_, devices1_.deletedOn AS deletedOn27_,
devices1_.deviceAlias AS deviceAl4_27_, devices1_.deviceName AS deviceName27_, devices1_.deviceTypeId AS deviceT21_27_,
devices1_.equipmentVendor AS equipmen6_27_, devices1_.exceptionDetail AS exceptio7_27_, devices1_.hardwareVersion AS hardware8_27_,
devices1_.ipAddress AS ipAddress27_, devices1_.isDeleted AS isDeleted27_, devices1_.loopBack AS loopBack27_,
devices1_.modifiedTime AS modifie12_27_, devices1_.osVersion AS osVersion27_, devices1_.productModel AS product14_27_,
devices1_.productName AS product15_27_, devices1_.routerType AS routerType27_, devices1_.rundate AS rundate27_,
devices1_.serialNumber AS serialN18_27_, devices1_.serviceName AS service19_27_, devices1_.siteId AS siteId27_,
devices1_.siteIdA AS siteIdA27_, devices1_.status AS status27_, devices1_.creator AS creator27_, devices1_.lastModifier AS lastMod25_27_
FROM goldenvariation goldenconf0_
INNER JOIN devices devices1_ ON goldenconf0_.deviceId=devices1_.id
CROSS JOIN devices devices2_
WHERE goldenconf0_.deviceId=devices2_.id
AND (goldenconf0_.classType = 'policy-options')
AND DATE(goldenconf0_.rundate)=DATE('2014-04-14 00:00:00')
AND devices2_.isDeleted=0
AND EXISTS (SELECT DISTINCT(deviceId) FROM goldenvariation goldenconf3_
WHERE (goldenconf3_.goldenVariationType = 'MISMATCH')
AND (goldenconf3_.classType = 'policy-options')
AND DATE(goldenconf3_.rundate)=DATE('2014-04-14 00:00:00'))
AND EXISTS (SELECT DISTINCT (deviceId) FROM goldenvariation goldenconf4_
WHERE (goldenconf4_.goldenVariationType = 'MISSING')
AND (goldenconf4_.classType = 'policy-options')
AND DATE(goldenconf4_.rundate)=DATE('2014-04-14 00:00:00'));
Its taking too much time, how i can rewrite the query and make it fast?
Table structure of goldervariation is:
CREATE TABLE `goldenvariation` (
`id` BIGINT(20) NOT NULL AUTO_INCREMENT,
`classType` VARCHAR(255) DEFAULT NULL,
`createdTime` DATETIME DEFAULT NULL,
`goldenValue` LONGTEXT,
`goldenXpath` VARCHAR(255) DEFAULT NULL,
`isMatched` TINYINT(1) DEFAULT NULL,
`modifiedTime` DATETIME DEFAULT NULL,
`pathValue` LONGTEXT,
`rundate` DATETIME DEFAULT NULL,
`value` LONGTEXT,
`xpath` VARCHAR(255) DEFAULT NULL,
`deviceId` BIGINT(20) DEFAULT NULL,
`goldenXpathId` BIGINT(20) DEFAULT NULL,
`creator` INT(10) UNSIGNED DEFAULT NULL,
`lastModifier` INT(10) UNSIGNED DEFAULT NULL,
`goldenVariationType` VARCHAR(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `FK6804472AD99F2D15` (`deviceId`),
KEY `FK6804472A98002838` (`goldenXpathId`),
KEY `FK6804472A27C863B` (`creator`),
KEY `FK6804472A3617A57C` (`lastModifier`),
KEY `rundateindex` (`rundate`),
KEY `varitionidindex` (`id`),
KEY `classTypeindex` (`classType`),
CONSTRAINT `FK6804472A27C863B` FOREIGN KEY (`creator`) REFERENCES `users` (`userid`),
CONSTRAINT `FK6804472A3617A57C` FOREIGN KEY (`lastModifier`) REFERENCES `users` (`userid`),
CONSTRAINT `FK6804472A98002838` FOREIGN KEY (`goldenXpathId`) REFERENCES `goldenconfigurationxpath` (`id`),
CONSTRAINT `FK6804472AD99F2D15` FOREIGN KEY (`deviceId`) REFERENCES `devices` (`id`)
) ENGINE=INNODB AUTO_INCREMENT=1868865 DEFAULT CHARSET=latin1;
And explain plan of query is :
"1" "PRIMARY" "goldenconf0_" "ref" "FK6804472AD99F2D15,classTypeindex" "classTypeindex" "258" "const" "179223" "Using where; Using temporary"
"1" "PRIMARY" "devices2_" "eq_ref" "PRIMARY,deviceindex" "PRIMARY" "8" "cmdb.goldenconf0_.deviceId" "1" "Using where"
"1" "PRIMARY" "devices1_" "eq_ref" "PRIMARY,deviceindex" "PRIMARY" "8" "cmdb.goldenconf0_.deviceId" "1" ""
"3" "DEPENDENT SUBQUERY" "goldenconf4_" "index_subquery" "FK6804472AD99F2D15,classTypeindex" "FK6804472AD99F2D15" "9" "func" "19795" "Using where"
"2" "DEPENDENT SUBQUERY" "goldenconf3_" "index_subquery" "FK6804472AD99F2D15,classTypeindex" "FK6804472AD99F2D15" "9" "func" "19795" "Using where"
Upvotes: 1
Views: 69
Reputation: 108400
Yes, the query could be rewritten to improve performance (though it looks like a query generated by Hibernate, and getting Hibernate to use a different query can be a challenge.)
How sure are you that this query is returning the resultset you expect? Because the query is rather odd.
In terms of performance, dollars to donuts, its the repeated executions of the dependent subqueries that are really eating your lunch, and your lunchbox, in terms of performance. It looks like MySQL is using the index on the deviceId
column to satisfy that subquery, and that doesn't look like the most appropriate index.
We notice that there are two JOIN operations to the devices table; there is no reason this table needs to be joined twice. Both JOIN operations require a match to the deviceID column of goldenvariation, the second join to the devices table does additional filtering with the isDeleted=0
. The keywords INNER
and CROSS
don't have any impact on the statement at all; and the second join to the devices
table isn't really a "cross" join, it's really an inner join. (We prefer to see the join predicates in an ON clause rather than the WHERE clause.
The DATE()
function wrapped around the rundate
column disables an index range scan operation. These predicates can be rewritten to take advantage of an appropriate index.
The DISTINCT(deviceId)
in the SELECT list of an EXISTS subquery is very strange. Firstly, DISTINCT
is a keyword, not a function. There's no need for parens around deviceId
. But beyond that, it doesn't matter what is returned in the SELECT list of the EXISTS subquery, it could just be SELECT 1
.
It's odd to see an EXISTS
predicate with a query that doesn't reference any expression in from the outer query (i.e. a correlated subquery). It's valid syntax. With a correlated subquery, MySQL performs that query for each and every row returned by the outer query. The EXPLAIN output looks like MySQL is doing the same thing, it didn't recognize any optimization.
The way those EXIST predicates are written, if there isn't a 'policy-options' row with 'MISMATCH' AND there isn't a 'policy-options' row with 'MISSING' (for the specified date, then the query will not return any rows. If a row of each type is found (for the specified date, then ALL of 'policy-options' rows for that date are returned. (It's syntactically valid, but it's rather odd.)
Assuming that the id
column on the devices table is UNIQUE (i.e. it's the PRIMARY KEY or there's a UNIQUE index on that column, then the DISTINCT keyword is unnecessary on the outermost query. (From the EXPLAIN output, it looks like MySQL already optimized away the usual operations, that is, MySQL recognized that the DISTINCT keyword is unnecessary.
But bottom line, it's the dependent subqueries that are killing performance; the absence of suitable indexes, and the predicate on the date column wrapped in a function.
To answer your question, yes, this query can be rewritten to return an equivalent resultset more efficiently. (It's not entirely clear that the query is returning the resultset you expect.)
SELECT d1.id AS id27_
, d1.createdTime AS createdT2_27_
, d1.deletedOn AS deletedOn27_
, d1.deviceAlias AS deviceAl4_27_
, d1.deviceName AS deviceName27_
, d1.deviceTypeId AS deviceT21_27_
, d1.equipmentVendor AS equipmen6_27_
, d1.exceptionDetail AS exceptio7_27_
, d1.hardwareVersion AS hardware8_27_
, d1.ipAddress AS ipAddress27_
, d1.isDeleted AS isDeleted27_
, d1.loopBack AS loopBack27_
, d1.modifiedTime AS modifie12_27_
, d1.osVersion AS osVersion27_
, d1.productModel AS product14_27_
, d1.productName AS product15_27_
, d1.routerType AS routerType27_
, d1.rundate AS rundate27_
, d1.serialNumber AS serialN18_27_
, d1.serviceName AS service19_27_
, d1.siteId AS siteId27_
, d1.siteIdA AS siteIdA27_
, d1.status AS status27_
, d1.creator AS creator27_
, d1.lastModifier AS lastMod25_27_
FROM devices d1
JOIN (SELECT g.deviceId
FROM goldenvariation g
CROSS
JOIN (SELECT 1
FROM goldenvariation x3
WHERE x3.goldenVariationType = 'MISMATCH'
AND x3.classType = 'policy-options'
AND x3.rundate >= '2014-04-14'
AND x3.rundate < '2014-04-14' + INTERVAL 1 DAY
LIMIT 1
) t3
CROSS
JOIN (SELECT 1
FROM goldenvariation x4
WHERE x4.goldenVariationType = 'MISSING'
AND x4.classType = 'policy-options'
AND x4.rundate >= '2014-04-14'
AND x4.rundate < '2014-04-14' + INTERVAL 1 DAY
LIMIT 1
) t4
WHERE g.classType = 'policy-options'
AND g.rundate >= '2014-04-14'
AND g.rundate < '2014-04-14' + INTERVAL 1 DAY
GROUP BY g.deviceId
) t2
ON t2.device_id = d1.id
WHERE d1.isDeleted=0
Upvotes: 1
Reputation: 48139
You are looking for elements associated with the golden variations table via EXISTS. I would start with that table to get distinct IDs, then join to your devices table. Also, when converting dates, you won't be able to take advantage of an INDEX (if so part of index).
INDEX... ( classType, rundate, goldenVariationType, deviceID )
CHANGE the date clause to >= ? and < ?+1 this way, you get the entire date range from 12:00:00 morning to 11:59:59pm of the same day and the index can utilize the date component without converting for every record.
Also, you are doing a cross-join to the devices table TWICE on the matching "ID" from the goldenVariations table to devices 1 and 2 on same ID which is wasteful and not doing anything.
Your devices table should have an index ON (id, isDeleted)
SELECT
d1.id AS id27,
d1.createdTime AS createdT2_27,
d1.deletedOn AS deletedOn27,
d1.deviceAlias AS deviceAl4_27_,
d1.deviceName AS deviceName27_,
d1.deviceTypeId AS deviceT21_27_,
d1.equipmentVendor AS equipmen6_27_,
d1.exceptionDetail AS exceptio7_27_,
d1.hardwareVersion AS hardware8_27_,
d1.ipAddress AS ipAddress27_,
d1.isDeleted AS isDeleted27_,
d1.loopBack AS loopBack27_,
d1.modifiedTime AS modifie12_27_,
d1.osVersion AS osVersion27_,
d1.productModel AS product14_27_,
d1.productName AS product15_27_,
d1.routerType AS routerType27_,
d1.rundate AS rundate27_,
d1.serialNumber AS serialN18_27_,
d1.serviceName AS service19_27_,
d1.siteId AS siteId27_,
d1.siteIdA AS siteIdA27_,
d1.status AS status27_,
d1.creator AS creator27_,
d1.lastModifier AS lastMod25_27_
from
( SELECT distinct
gv.deviceID
from
goldenVariation gv
where
gv.classType = 'policy-options'
AND gv.runDate >= '2014-04-14'
AND gv.runDate < '2014-04-15'
AND gv.goldenVariationType IN ( 'MISSING', 'MISMATCH' )) PQ
JOIN devices d1
ON PQ.deviceId = d1.id
AND d1.isDeleted = 0
Upvotes: 1
Reputation: 426
INNER JOIN goldenvariation goldenconf4_
ON goldenconf4_.deviceId = goldenconf0_deviceId
AND (goldenconf4_.goldenVariationType = 'MISSING')
AND (goldenconf4_.classType = 'policy-options')
AND DATE(goldenconf4_.rundate)=DATE('2014-04-14 00:00:00'))
In the same way change another EXISTS
. I think this one should work much faster. Also small tips from me: try to use shorter aliases. Your query is really hard to read.
SELECT DISTINCT
devices1_.id AS id27_,
devices1_.createdTime AS createdT2_27_,
devices1_.deletedOn AS deletedOn27_,
devices1_.deviceAlias AS deviceAl4_27_,
devices1_.deviceName AS deviceName27_,
devices1_.deviceTypeId AS deviceT21_27_,
devices1_.equipmentVendor AS equipmen6_27_,
devices1_.exceptionDetail AS exceptio7_27_,
devices1_.hardwareVersion AS hardware8_27_,
devices1_.ipAddress AS ipAddress27_,
devices1_.isDeleted AS isDeleted27_,
devices1_.loopBack AS loopBack27_,
devices1_.modifiedTime AS modifie12_27_,
devices1_.osVersion AS osVersion27_,
devices1_.productModel AS product14_27_,
devices1_.productName AS product15_27_,
devices1_.routerType AS routerType27_,
devices1_.rundate AS rundate27_,
devices1_.serialNumber AS serialN18_27_,
devices1_.serviceName AS service19_27_,
devices1_.siteId AS siteId27_,
devices1_.siteIdA AS siteIdA27_,
devices1_.status AS status27_,
devices1_.creator AS creator27_,
devices1_.lastModifier AS lastMod25_27_
FROM goldenvariation goldenconf0_
INNER JOIN devices devices1_ ON goldenconf0_.deviceId=devices1_.id
INNER JOIN goldenvariation a on a.deviceId = goldenconf0_.deviceId and a.goldenVariationType = 'MISMATCH'
INNER JOIN goldenvariation b on b.deviceId = goldenconf0_.deviceId and b.goldenVariationType = 'MISSING'
WHERE (goldenconf0_.classType = 'policy-options')
AND convert(date,goldenconf0_.rundate) = '2014-04-14'
AND devices1_.isDeleted=0
Try this one. Should work much faster than your query. You joined table using CROSS JOIN
but not even 1 column from this was used in SELECT
.
Upvotes: 1