Reputation: 67
We are using the following question to get date out of a large mysql table.
SELECT fullPath, Permissiontype, DinstinguishedName
from cdm.test
where fullPath in
(SELECT distinct fullPath
FROMcdm.test
WHERE (Permissiontype = 'EXPLICIT' and not DinstinguishedName ='')
OR(Permissiontype = 'INHERITED'
AND (length(fullPath) - length(replace(fullPath,'/','')) < 4))
OR(Permissiontype = 'EXPLICIT'
AND NOT DinstinguishedName=''
AND LEFT(fullPath,length(fullPath)-Length(RIGHT(fullPath,INSTR(reverse(fullPath),'/'))))
AND(length(fullPath) - length(replace(fullPath,'/','')) > 2) ))
When I limit the results that need to be shown to 270 it runs really fast, but for example 500 rows it just doesn't run. I have for 1 case 77mil rows in the table (needs to be in 1 table). And then it runs over 8 hours and still doesn't finish. Is there a way to optimize this?
wkr.
Upvotes: 0
Views: 1475
Reputation: 2465
For the combination of IN statement and subqueries mysql has a usefull optimizer called EXISTS() (not specifically meant for this but can be used to optimize queries that use the IN statement in combination with subquery)
According to the reference on https://dev.mysql.com/doc/refman/5.0/en/subquery-optimization-with-exists.html
outer_expr IN (SELECT inner_expr FROM ... WHERE subquery_where)
Would be the same as
EXISTS (SELECT 1 FROM ... WHERE subquery_where AND outer_expr=inner_expr)
But faster
Your query with exist:
SELECT fullPath, Permissiontype, DinstinguishedName
FROM cdm.test cdm1
WHERE EXISTS(SELECT 0 FROM cdm.test cdm2 @wherecondition AND cdm2.fullPath = cdm1.fullPath)
@wherecondition =
WHERE (Permissiontype = 'EXPLICIT' and not DinstinguishedName ='')
OR(Permissiontype = 'INHERITED'
AND (length(fullPath) - length(replace(fullPath,'/','')) < 4))
OR(Permissiontype = 'EXPLICIT'
AND NOT DinstinguishedName=''
AND LEFT(fullPath,length(fullPath)-Length(RIGHT(fullPath,INSTR(reverse(fullPath),'/'))))
AND(length(fullPath) - length(replace(fullPath,'/','')) > 2))
Upvotes: 0
Reputation: 44864
There are several issues which would make the query execution very slow
You are using in
clause and they are bad in many occasions, so better would be convert them to JOIN clause.
Even with the JOIN the inner query has multiple OR
conditions and optimizer can not use the indexes for them.
Why you need the in clause
at all ?
The same query could be written as
select fullPath,Permissiontype,DinstinguishedName from cdm.test
where
Permissiontype = 'EXPLICIT'
and not DinstinguishedName =''
union
select fullPath,Permissiontype,DinstinguishedName from cdm.test
where Permissiontype = 'INHERITED'
AND (length(fullPath) - length(replace(fullPath,'/','')) < 4)
union
select fullPath,Permissiontype,DinstinguishedName from cdm.test
where Permissiontype = 'EXPLICIT'
AND NOT DinstinguishedName=''
AND LEFT(fullPath,length(fullPath)-length(RIGHT(fullPath,INSTR(reverse(fullPath),'/'))))
and(length(fullPath) - length(replace(fullPath,'/','')) > 2)
Note that I have changed all the OR
conditions to union
and this will be faster.
Now add an index if its not added already as
alter table cdm.test add index search_data_idx(Permissiontype,DinstinguishedName,fullPath);
Upvotes: 0
Reputation: 1969
For each record in the test table you're querying the entire table again in the subquery. Instead of using a subquery in the where clause, try an inner join on the same table. This will dramatically improve your performance.
I haven't tried it yet but it could look like:
SELECT fullPath, Permissiontype, DinstinguishedName from cdm.test
INNER JOIN (
SELECT distinct fullPath from cdm.test
where (Permissiontype = 'EXPLICIT' and not DinstinguishedName ='')
or (Permissiontype = 'INHERITED' AND (length(fullPath) - length(replace(fullPath,'/','')) < 4)) OR(Permissiontype = 'EXPLICIT'
AND NOT DinstinguishedName='' AND LEFT(fullPath,length(fullPath)-length(RIGHT(fullPath,INSTR(reverse(fullPath),'/'))))
and(length(fullPath) - length(replace(fullPath,'/','')) > 2) )
) AS SQ1
ON SQ1.fullpath = cdm.test.fullpath
Upvotes: 1