user1894374
user1894374

Reputation: 233

SQL SELECT DISTINCT performance with multiple table JOINs

I am building a database containing information on university courses. Each course can be associated with

My database contains the following tables:

In order to retrieve ALL courses from the database (and the corresponding author,discipline, institution, and level information), I use the following query:

SELECT DISTINCT aut_last, c.cou_id, cou_name, cou_number, cou_year, cou_term, dis_name, ins_name, ins_classification, lev_name
    FROM authorcourse ac1
    INNER JOIN authorcourse ac2        
    ON ac1.cou_id = ac2.cou_id        
    INNER JOIN author a
    ON ac2.aut_id=a.aut_id
    INNER JOIN course c
    ON ac2.cou_id = c.cou_id
    INNER JOIN coursediscipline cd1
    ON ac2.cou_id = cd1.cou_id
    INNER JOIN coursediscipline cd2
    ON cd1.cou_id = cd2.cou_id
    INNER JOIN discipline d
    ON cd2.dis_id = d.dis_id
    INNER JOIN courseinstitution ci1
    ON ac2.cou_id = ci1.cou_id
    INNER JOIN courseinstitution ci2
    ON ci1.cou_id = ci2.cou_id
    INNER JOIN institution i
    ON ci2.ins_id = i.ins_id
    INNER JOIN courselevel cl1
    ON ac2.cou_id = cl1.cou_id
    INNER JOIN courselevel cl2
    ON cl1.cou_id = cl2.cou_id
    INNER JOIN level l
    ON cl2.lev_id = l.lev_id

This query works well when there are 15 courses in the database with “simple” relationships. e.g.:

 cou_name = 'course1', cou_number = 'C1', cou_year = '1999', cou_term = 'summer'
 aut_last = 'Doe1'
 dis_name = 'discipline1'
 ins_name = 'institution1', ins_classification = 'classification1'
 lev_name = 'level1'

-->Showing rows 0 - 14 ( 15 total, Query took 0.0118 sec) EXPLAIN produces the following table:

id select_type table type   possible_keys          key     key_len ref             rows Extra
1  SIMPLE      ac1   index  cou_id                 aut_id  2       NULL            15   Using index; Using  temporary
1  SIMPLE      ac2   ref    PRIMARY,aut_id,cou_id  cou_id  2       ccdb.ac1.cou_id 1    Using index
1  SIMPLE      a     eq_ref PRIMARY                PRIMARY 2       ccdb.ac2.aut_id 1    
1  SIMPLE      c     eq_ref PRIMARY                PRIMARY 2       ccdb.ac2.cou_id 1    Using where
1  SIMPLE      cd1   ref    PRIMARY,cou_id         PRIMARY 2       ccdb.ac1.cou_id 1    Using index
1  SIMPLE      cd2   ref    PRIMARY,cou_id,dis_id  PRIMARY 2       ccdb.ac2.cou_id 1    Using where; Using index
1  SIMPLE      d     eq_ref PRIMARY                PRIMARY 2       ccdb.cd2.dis_id 1    
1  SIMPLE      ci1   ref    PRIMARY,cou_id         PRIMARY 2       ccdb.ac2.cou_id 1    Using where; Using index
1  SIMPLE      ci2   ref    PRIMARY,cou_id,ins_id  PRIMARY 2       ccdb.ac2.cou_id 1    Using where; Using index
1  SIMPLE      i     eq_ref PRIMARY                PRIMARY 2       ccdb.ci2.ins_id 1    
1  SIMPLE      cl1   ref    PRIMARY,cou_id         PRIMARY 2       ccdb.cd1.cou_id 1    Using where; Using index
1  SIMPLE      cl2   ref    PRIMARY,cou_id,lev_id  PRIMARY 2       ccdb.cl1.cou_id 1    Using where; Using index
1  SIMPLE      l     eq_ref PRIMARY                PRIMARY 2       ccdb.cl2.lev_id 1    

Problem: Performance dramatically decreases when there are 15 courses with multiple relationships. Example course:

cou_name = 'course1', cou_number = 'C1', cou_year = '1999', cou_term = 'summer'
aut_last = 'Doe1', 'Doe', 'Doe3', 'Doe4'
dis_name = 'discipline1', 'discipline2', 'discipline3', 'discipline4'
ins_name = 'institution1'(ins_classification = 'classification1'),     'institution2'(ins_classification = 'classification2'), 'institution3'(ins_classification =  'classification3'), 'institution4' (ins_classification = 'classification4')
lev_name = 'level1', 'level2', 'level3', 'level4'

-->Showing rows 0 - 29 ( 3,840 total, Query took 14.7039 sec) EXPLAIN produces the following table:

 id select_type table type   possible_keys         key     key_len ref             rows Extra
 1  SIMPLE      c     ALL    PRIMARY               NULL    NULL    NULL            15   Using temporary
 1  SIMPLE      ac1   ref    PRIMARY,aut_id,cou_id cou_id  2       ccdb.c.cou_id   2    Using index
 1  SIMPLE      a     eq_ref PRIMARY               PRIMARY 2       ccdb.ac1.aut_id 1    
 1  SIMPLE      ac2   ref    cou_id                cou_id  2       ccdb.c.cou_id   2    Using index
 1  SIMPLE      cd1   ref    PRIMARY,cou_id        cou_id  2       ccdb.ac1.cou_id 2    Using where; Using index
 1  SIMPLE      cd2   ref    PRIMARY,cou_id,dis_id cou_id  2       ccdb.c.cou_id   2    Using index
 1  SIMPLE      d     eq_ref PRIMARY               PRIMARY 2       ccdb.cd2.dis_id 1    
 1  SIMPLE      ci1   ref    PRIMARY,cou_id        cou_id  2       ccdb.ac1.cou_id 2    Using where; Using index
 1  SIMPLE      ci2   ref    PRIMARY,cou_id,ins_id cou_id  2       ccdb.ac2.cou_id 2    Using where; Using index
 1  SIMPLE      i     eq_ref PRIMARY               PRIMARY 2       ccdb.ci2.ins_id 1    
 1  SIMPLE      cl1   ref    PRIMARY,cou_id        cou_id  2       ccdb.c.cou_id   2    Using index
 1  SIMPLE      cl2   ref    PRIMARY,cou_id,lev_id cou_id  2       ccdb.ci2.cou_id 2    Using where; Using index
 1  SIMPLE      l     eq_ref PRIMARY               PRIMARY 2       ccdb.cl2.lev_id 1    

When run through my PHP website, I get the following error “Fatal error: Maximum execution time of 30 seconds exceeded in …”

Question: How can I speed up this query? I tried several different combinations of the joins and (as you can see in the EXPLAIN results) I indexed all columns I considered potentially relevant.

Any help would be greatly appreciated.

Upvotes: 2

Views: 2170

Answers (3)

Tom
Tom

Reputation: 6663

You appear to have unnecessary joins in your query. I believe you could get the same by doing the following. It would probably improve your query's performance.

SELECT DISTINCT aut_last, c.cou_id, cou_name, cou_number, cou_year, cou_term, dis_name, ins_name, ins_classification, lev_name
    FROM authorcourse ac 
    INNER JOIN author a
    ON ac.aut_id=a.aut_id
    INNER JOIN course c
    ON ac.cou_id = c.cou_id
    INNER JOIN coursediscipline cd 
    ON ac.cou_id = cd.cou_id
    INNER JOIN discipline d
    ON cd.dis_id = d.dis_id
    INNER JOIN courseinstitution ci 
    ON ac.cou_id = ci.cou_id
    INNER JOIN institution i
    ON ci.ins_id = i.ins_id
    INNER JOIN courselevel cl 
    ON ac.cou_id = cl.cou_id
    INNER JOIN level l
    ON cl.lev_id = l.lev_id

You had redundant joins to the same tables which did not seem to be accomplishing anything.

Upvotes: 1

Mark1inLA
Mark1inLA

Reputation: 196

I dont believe there's enough information to resolve your problem outright. By looking at your explain statement, it looks like you have all the indexes setup correctly, but the number of rows that you're getting is the concern.

The culprit is most likely due to a table scan on a really large table is being compounded by the number of rows you're getting. From my experience when I encounter these issues, I pinpoint where the slowless occurs and work from there. Some strategies to use are temporary tables or subqueries (which break them up into smaller, more managable queries.

Also to get rid of that PHP fatal error problem, you should be able use a try / catch exception block to handle that gracefully.

Upvotes: 0

ManreeRist
ManreeRist

Reputation: 504

IT looks to me asif your pulling all this data for the 'course view' type detail page ?

If so I would say, once a course has been made in the database, how often are the number of authors, disciplines, institutions and levels going to change ?

IF there never going to change from the time there set, then when it is set, also set it in a totally denormalised table like this:

courseView (cou_id, cou_name, cou_number, cou_year, cou_term, data)

.. and in 'data', you just put in a serialised array of all the data. Ugly, but it's going to be fast.

Then, when you search by course id to pull up the one, you can search only one row, one index and pull up all the data instantly.

..

Also, if your going to let people search by Authors, then you can still do this with the normalised table with a simple query like normal.

Upvotes: 1

Related Questions