I'm looking to upgrade my MySQL database from 5.7.33 to 8.0.28. I use it with a Laravel app that generates queries using a whereHas method that uses an exists subquery. However, I've noticed that performance has substantially reduced between the two versions - from ~5ms to 4s. Here are the queries: select * from `categories` where exists (select * from `jobs` inner join `category_job` on `jobs`.`id` = `category_job`.`job_id` where `categories`.`id` = `category_job`.`category_id`) order by `name` asc What is interesting if that if I remove the exists subquery entirely, the query returns to a reasonable speed again - around the 5ms mark. select * from `categories` order by `name` asc What I'm trying to understand is why the performances changes so substantially between the two versions - if there is an error in my SQL that could be improved (or perhaps, changed in the framework to avoid the problem) - or this is simply a performance regression in MySQL that I'll have to live with. I get that the first query is more work and going to be slower with the exists subquery, but I don't understand how simply upgrading MySQL has had thie effect. When I run the query with EXPLAINS I get different results between 5.7 and 8.0 as well (note also that the databases have the same indexes): 5.7: id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 PRIMARY categories ALL 27 100.00 Using where; Using filesort 2 DEPENDENT SUBQUERY category_job ref category_job_category_id_index,category_job_job_id_index category_job_category_id_index 8 jobsnearme.categories.id 35253 100.00 2 DEPENDENT SUBQUERY jobs eq_ref PRIMARY PRIMARY 8 jobsnearme.category_job.job_id 1 100.00 Using index 8.0: id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 SIMPLE categories NULL ALL PRIMARY NULL NULL NULL 30 100.00 Using filesort 1 SIMPLE NULL eq_ref <auto_distinct_key> <auto_distinct_key> 8 jobsnearme.categories.id 1 100.00 NULL 2 MATERIALIZED jobs NULL index PRIMARY jobs_company_id_index 9 NULL 718220 100.00 Using index 2 MATERIALIZED category_job NULL ref category_job_category_id_index,category_job_job_id_index category_job_job_id_index 8 jobsnearme.jobs.id 1 100.00 NULL Here is the SHOW CREATE TABLE category_job from 8.0: CREATE TABLE `category_job` ( `id` bigint unsigned NOT NULL AUTO_INCREMENT, `category_id` bigint unsigned NOT NULL, `job_id` bigint unsigned NOT NULL, PRIMARY KEY (`id`), KEY `category_job_category_id_index` (`category_id`), KEY `category_job_job_id_index` (`job_id`), CONSTRAINT `category_job_category_id_foreign` FOREIGN KEY (`category_id`) REFERENCES `categories` (`id`), CONSTRAINT `category_job_job_id_foreign` FOREIGN KEY (`job_id`) REFERENCES `jobs` (`id`) ON DELETE CASCADE ) ENGINE=InnoDB AUTO_INCREMENT=1070585 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

Reputation: 12450

MySQL exists subquery performs significantly worse in 8.0 than 5.7

I'm looking to upgrade my MySQL database from 5.7.33 to 8.0.28. I use it with a Laravel app that generates queries using a whereHas method that uses an exists subquery. However, I've noticed that performance has substantially reduced between the two versions - from ~5ms to 4s.

Here are the queries:

select * from `categories` where exists (select * from `jobs` inner join `category_job` on `jobs`.`id` = `category_job`.`job_id` where `categories`.`id` = `category_job`.`category_id`) order by `name` asc

What is interesting if that if I remove the exists subquery entirely, the query returns to a reasonable speed again - around the 5ms mark.

select * from `categories` order by `name` asc

What I'm trying to understand is why the performances changes so substantially between the two versions - if there is an error in my SQL that could be improved (or perhaps, changed in the framework to avoid the problem) - or this is simply a performance regression in MySQL that I'll have to live with.

I get that the first query is more work and going to be slower with the exists subquery, but I don't understand how simply upgrading MySQL has had thie effect.

When I run the query with EXPLAINS I get different results between 5.7 and 8.0 as well (note also that the databases have the same indexes):

5.7:

id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	categories	ALL					27	100.00	Using where; Using filesort
2	DEPENDENT SUBQUERY	category_job	ref	category_job_category_id_index,category_job_job_id_index	category_job_category_id_index	8	jobsnearme.categories.id	35253	100.00
2	DEPENDENT SUBQUERY	jobs	eq_ref	PRIMARY	PRIMARY	8	jobsnearme.category_job.job_id	1	100.00	Using index

8.0:

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	categories	NULL	ALL	PRIMARY	NULL	NULL	NULL	30	100.00	Using filesort
1	SIMPLE		NULL	eq_ref	<auto_distinct_key>	<auto_distinct_key>	8	jobsnearme.categories.id	1	100.00	NULL
2	MATERIALIZED	jobs	NULL	index	PRIMARY	jobs_company_id_index	9	NULL	718220	100.00	Using index
2	MATERIALIZED	category_job	NULL	ref	category_job_category_id_index,category_job_job_id_index	category_job_job_id_index	8	jobsnearme.jobs.id	1	100.00	NULL

Here is the SHOW CREATE TABLE category_job from 8.0:

CREATE TABLE `category_job` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `category_id` bigint unsigned NOT NULL,
  `job_id` bigint unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `category_job_category_id_index` (`category_id`),
  KEY `category_job_job_id_index` (`job_id`),
  CONSTRAINT `category_job_category_id_foreign` FOREIGN KEY (`category_id`) REFERENCES `categories` (`id`),
  CONSTRAINT `category_job_job_id_foreign` FOREIGN KEY (`job_id`) REFERENCES `jobs` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1070585 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

Upvotes: 2

Answers (2)

Arie S.

Reputation: 63

Try to set semijoin=off in the optimizer_switch and check the EXPLAIN result

SET optimizer_switch = 'semijoin=off';

Upvotes: 2

Rick James

Reputation: 142298

A common inefficiency in a many-to-many table schema.

This would be faster:

CREATE TABLE `category_job` (
  `category_id` bigint unsigned NOT NULL,
  `job_id` bigint unsigned NOT NULL,
  PRIMARY KEY (category_id, job_id),
  KEY (job_id, category_id)
) ENGINE=InnoDB

Upvotes: -2

MySQL exists subquery performs significantly worse in 8.0 than 5.7

Answers (2)

Related Questions