GordyD
GordyD

Reputation: 5103

Optimise MySql query using temporary and filesort

I have this query (shown below) which currently uses temporary and filesort in order to generate a grouped by set of ordered results. I would like to get rid of their usage if possible. I have looked into the underlying indexes used in this query and I just can't see what is missing.

SELECT 
  b.institutionid AS b__institutionid,
  b.name AS b__name,  
  COUNT(DISTINCT f2.facebook_id) AS f2__0 
FROM education_institutions b 
LEFT JOIN facebook_education_matches f ON b.institutionid = f.institutionid 
LEFT JOIN facebook_education f2 ON f.school_uid = f2.school_uid 
WHERE 
  (
  b.approved = '1' 
  AND f2.facebook_id IN ( [lots of facebook ids here ])
  ) 
GROUP BY b__institutionid 
ORDER BY f2__0 DESC
LIMIT 10

Here is the output for EXPLAIN EXTENDED :

+----+-------------+-------+--------+--------------------------------+----------------+---------+----------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                  | key            | key_len | ref                              | rows | filtered | Extra                                        |
+----+-------------+-------+--------+--------------------------------+----------------+---------+----------------------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | f     | index  | PRIMARY,institutionId          | institutionId  | 4       | NULL                             |  308 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | f2    | ref    | facebook_id_idx,school_uid_idx | school_uid_idx | 9       | f.school_uid                     |    1 |   100.00 | Using where                                  |
|  1 | SIMPLE      | b     | eq_ref | PRIMARY                        | PRIMARY        | 4       | f.institutionId                  |    1 |   100.00 | Using where                                  |
+----+-------------+-------+--------+--------------------------------+----------------+---------+----------------------------------+------+----------+----------------------------------------------+

The CREATE TABLE statements for each table are shown below so you know the schema.

CREATE TABLE facebook_education (
  education_id int(11) NOT NULL AUTO_INCREMENT,
  name varchar(255) DEFAULT NULL,
  school_uid bigint(20) DEFAULT NULL,
  school_type varchar(255) DEFAULT NULL,
  year smallint(6) DEFAULT NULL,
  facebook_id bigint(20) DEFAULT NULL,
  degree varchar(255) DEFAULT NULL,
  PRIMARY KEY (education_id),
  KEY facebook_id_idx (facebook_id),
  KEY school_uid_idx (school_uid),
  CONSTRAINT facebook_education_facebook_id_facebook_user_facebook_id FOREIGN KEY (facebook_id) REFERENCES facebook_user (facebook_id)
) ENGINE=InnoDB AUTO_INCREMENT=484 DEFAULT CHARSET=utf8;

CREATE TABLE facebook_education_matches (
  school_uid bigint(20) NOT NULL,
  institutionId int(11) NOT NULL,
  created_at timestamp NULL DEFAULT NULL,
  updated_at timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (school_uid),
  KEY institutionId (institutionId),
  CONSTRAINT fk_facebook_education FOREIGN KEY (school_uid) REFERENCES facebook_education (school_uid) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT fk_education_institutions FOREIGN KEY (institutionId) REFERENCES education_institutions (institutionId) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT;

CREATE TABLE education_institutions (
  institutionId int(11) NOT NULL AUTO_INCREMENT,
  name varchar(100) NOT NULL,
  type enum('School','Degree') DEFAULT NULL,
  approved tinyint(1) NOT NULL DEFAULT '0',
  deleted tinyint(1) NOT NULL DEFAULT '0',
  normalisedName varchar(100) NOT NULL,
  created_at timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (institutionId)
) ENGINE=InnoDB AUTO_INCREMENT=101327 DEFAULT CHARSET=utf8;

Any guidance would be greatly appreciated.

Upvotes: 4

Views: 2285

Answers (2)

gbn
gbn

Reputation: 432180

The filesort probably happens because you have no suitable index for the ORDER BY

It's mentioned in the MySQL "ORDER BY Optimization" docs.

What you can do is load a temp table, select from that afterwards. When you load the temp table, use ORDER BY NULL. When you select from the temp table, use ORDER BY .. LIMIT

The issue is that group by adds an implicit order by <group by clause> ASC unless you disable that behavior by adding a order by null.
It's one of those MySQL specific gotcha's.

Upvotes: 4

Vivek Viswanathan
Vivek Viswanathan

Reputation: 1963

I can see two possible optimizations,

  1. b.approved = '1' - You definitely need an index on approved column for quick filtering.

  2. f2.facebook_id IN ( [lots of facebook ids here ]) ) - Store the facebook ids in a temp table,. Then create an index on the temp table and then join with the temp table instead of using IN clause.

Upvotes: 0

Related Questions