Kouja
Kouja

Reputation: 183

Optimise DISTINCT . in mysql from a large Mysql table

I have a problem with a query, It make more than 1 minutes and I think the problem come from my distinct

My index: table1 (project_name) table2 (importer_id,sipplier_id) table3 (name)

My table 2 have more than 2 millions rows

Here my query

SELECT DISTINCT table1.id         AS project_id,
                table1.name       AS project_name,
                table3.id         AS supplier_id,
                table4.report_id as report_id,
                table3.name       AS supplier_name
FROM  table1
         LEFT JOIN  table2 ON table2.project_id = table1.id
         LEFT JOIN  table3 ON table3.id = table2.supplier_id
         LEFT JOIN  table4 ON table4.project_id = table1.id
WHERE table1.id IN
      ('2067', '2068', '2069', '2070', '2072', '2073', '2074', '2083', '2084', '2085', '2086', '2087', '2088', '2089',
       '2090', '2091', '2092', '2093', '2094', '2095', '2096', '2097', '2098', '2100', '2101', '2102', '2103', '2104',
       '2105', '2106', '2107', '2108', '2109', '2110', '2111', '2112', '2113', '2114', '2115', '2116', '2117', '2118',
       '2119', '2120', '2121', '2122', '2123', '2124', '2125', '2126', '2127')
  AND (table2.importer_id IN ('215') OR table2.supplier_id IN ('215'))

Explain

Create table statement for all tables (table 1 table 2 table 3 table 4)

CREATE TABLE `table1` (
                                  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
                                  `name` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
                                  `client_id` int(11) unsigned NOT NULL,

                                  PRIMARY KEY (`id`),
                                  KEY `FK_wsxix_projects` (`client_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2085 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;



CREATE TABLE `table2` (
                          `id` int(10) unsigned NOT NULL AUTO_INCREMENT,

                          `buyer_id` int(11) unsigned DEFAULT NULL,
                          `client_id` int(11) unsigned DEFAULT NULL,
                          `merchandiser_user_id` int(11) unsigned DEFAULT NULL,
                          `factory_id` int(11) unsigned DEFAULT NULL,
                          `dc_id` int(11) unsigned DEFAULT NULL,
                          `parent_id` int(10) unsigned DEFAULT NULL,
                          `supplier_id` int(11) unsigned NOT NULL,
                          `importer_id` int(11) unsigned NOT NULL,
                          `project_id` int(11) unsigned NOT NULL,
                          `tier_two_id` int(11) unsigned DEFAULT NULL,
                          `store_id` int(11) unsigned DEFAULT NULL,
                          PRIMARY KEY (`id`),
                          KEY `buyer_id` (`buyer_id`),
                          KEY `client_id` (`client_id`),
                          KEY `merchandiser_user_id` (`merchandiser_user_id`),
                          KEY `FK_po_factory_companies_id` (`factory_id`),
                          KEY `FK_po_dc_companies_id` (`dc_id`),
                          KEY `index_po_parentid` (`parent_id`),
                          KEY `index_purchaseorders_cie` (`supplier_id`),
                          KEY `index_poimporter_cie` (`importer_id`),
                          KEY `index_purchaseorders_project` (`project_id`),
                          KEY `facory_importer_id` (`factory_id`,`importer_id`),
                          KEY `posu` (`project_id`,`supplier_id`),
                          KEY `store_id` (`store_id`),
                          KEY `tier_two_id` (`tier_two_id`)

) ENGINE=InnoDB AUTO_INCREMENT=34959 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;






CREATE TABLE `table3` (
                                   `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
                                   `parent_company_id` int(11) DEFAULT NULL,
                                   `name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
                                   `type_id` int(11) unsigned DEFAULT NULL,
                                   `description` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,

                                   `use_majora_majorb` tinyint(1) NOT NULL DEFAULT '0',
                                   `aql_presets_id` int(11) unsigned DEFAULT NULL,
                                   `fabx_presets_id` int(11) unsigned DEFAULT NULL,
                                   `group_id` int(11) unsigned DEFAULT NULL,
                                   `contact_number` varchar(55) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
                                   `unit_number` varchar(55) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
                                   `use_factory_location` tinyint(1) unsigned DEFAULT NULL,
                                   `restrict_users_access_to_sub_companies` tinyint(1) unsigned DEFAULT '0' COMMENT 'Using for share function so far',
                                   PRIMARY KEY (`id`),
                                   KEY `index_wsxix_companies_aql_presets_id` (`aql_presets_id`),
                                   KEY `index_company_type` (`type_id`),
                                   KEY `index_companies_group_id` (`group_id`),
                                   KEY `fk_wsxix_companies_fabx_presets_idx` (`fabx_presets_id`)

) ENGINE=InnoDB AUTO_INCREMENT=992 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;



CREATE TABLE `table4` (
                                         `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
                                         `report_id` int(11) NOT NULL,
                                         `project_id` int(11) unsigned NOT NULL,
                                         PRIMARY KEY (`id`),
                                         KEY `index_project_reports_projects` (`project_id`),
                                         KEY `index_project_reports_reports` (`report_id`),
                                         CONSTRAINT `fk_project_reports_projects` FOREIGN KEY (`project_id`) REFERENCES `wsxix_projects` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
                                         CONSTRAINT `fk_project_reports_reports` FOREIGN KEY (`report_id`) REFERENCES `wsxix_report_types` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=138 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

Upvotes: 0

Views: 61

Answers (1)

Willem Renzema
Willem Renzema

Reputation: 5187

Alright, here is my first attempt. Keep in mind that this is a rather difficult query to optimize, given all the OR and IN statements.

Try the following rewritten query. It uses a semi-join to avoid the need for DISTINCT, which may result in a decent performance improvement.

SELECT
table1.id         AS project_id,
table1.name       AS project_name,
table3.id         AS supplier_id,
table4.report_id  AS report_id,
table3.name       AS supplier_name
FROM table1
LEFT OUTER JOIN table3
ON EXISTS (
  SELECT 1 FROM table2
  WHERE table2.project_id = table1.id
  AND table2.supplier_id = table3.id
  AND (table2.importer_id IN ('215') OR table2.supplier_id IN ('215'))
)
LEFT OUTER JOIN table4
ON table4.project_id = table1.id
WHERE table1.id IN
  ('2067', '2068', '2069', '2070', '2072', '2073', '2074', '2083', '2084', '2085', '2086', '2087', '2088', '2089',
   '2090', '2091', '2092', '2093', '2094', '2095', '2096', '2097', '2098', '2100', '2101', '2102', '2103', '2104',
   '2105', '2106', '2107', '2108', '2109', '2110', '2111', '2112', '2113', '2114', '2115', '2116', '2117', '2118',
   '2119', '2120', '2121', '2122', '2123', '2124', '2125', '2126', '2127')

Then, add the following indexes, which should help performance as well:

ALTER TABLE table4
ADD INDEX sowr1 (project_id,report_id)

ALTER TABLE table2
ADD INDEX sowr2 (project_id,supplier_id,importer_id)

Run the query at least twice, and ignore the timing on the first run, so you can avoid the InnoDB buffer cache from giving inaccurate speed results.

Let me know how fast it runs. Then, if it still needs more improvement, provide the EXPLAIN plan for this query, as without your data at hand it is difficult to see exactly how well these changes will work.

Upvotes: 1

Related Questions