StefanoV827
StefanoV827

Reputation: 309

Super-slow MySQL query with WHERE and Index

i have a super slow query without any clue about the reason why it is slow, and some strange behavious on it too.

SELECT <fields list> 
FROM scadenze s 
     JOIN cartelle_cliniche cc ON (cc.id = s.id_cartella) 
     JOIN anagrafica a ON (a.id = cc.id_anagrafica) 
     JOIN scadenze_sedute ss ON (ss.id_scadenza = s.id) 
     JOIN pmultiple_prenotazioni pp ON (pp.id = ss.id_fisico OR pp.id = ss.id_virtuale) 
     LEFT JOIN medici_privati mp ON (mp.id = a.id_medico) 
     LEFT JOIN operatorif o ON (o.id = cc.id_operatore_emittente) 
WHERE pp.confermato = '1' AND pp.annullato = '0' 
      AND pp.id_esito != 4 AND s.id_stato = '0' 
      AND (DATE(s.data_scadenza) BETWEEN '2021-06-01' AND '2021-06-16') 
GROUP BY s.id, TIME(ss.data_seduta)

It lasts 5 minutes!!

Now, as you can see there are a lot of JOINs and only 5 WHERE clauses. If i remove from WHERE clauses these 3 parameters: pp.confermato = '1' AND pp.annullato = '0' AND pp.id_esito != 4, the query lasts only 0.15 seconds. I have single Indexes on these 3 parameters, and i also tried to add a multiple Index using the FORCE INDEX function of MySQL. But nothing.

If instead i put that parameters in an HAVING clause, like:

SELECT <fields list> 
FROM scadenze s 
     JOIN cartelle_cliniche cc ON (cc.id = s.id_cartella) 
     JOIN anagrafica a ON (a.id = cc.id_anagrafica) 
     JOIN scadenze_sedute ss ON (ss.id_scadenza = s.id) 
     JOIN pmultiple_prenotazioni pp ON (pp.id = ss.id_fisico OR pp.id = ss.id_virtuale) 
     LEFT JOIN medici_privati mp ON (mp.id = a.id_medico) 
     LEFT JOIN operatorif o ON (o.id = cc.id_operatore_emittente) 
WHERE s.id_stato = '0' AND (DATE(s.data_scadenza) BETWEEN '2021-06-01' AND '2021-06-16') 
GROUP BY s.id, TIME(ss.data_seduta) 
HAVING pp.confermato = '1' AND pp.annullato = '0' AND pp.id_esito != 4

Query lasts 0.15 seconds. But what's the problem with using these parameters in the WHERE clauses? (In Having clause i get 308 records instead of 309)

What does can cause this slow query if i have Indexes on any of that field?

Thank you in advance.

EDIT:

Explain query here: Explain query

CREATE Table for pmultiple_prenotazione (pp) here:

CREATE TABLE `pmultiple_prenotazioni` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `id_utente` INT(11) NOT NULL,
    `id_anagrafica` INT(11) NOT NULL DEFAULT '0',
    `id_cartella` INT(11) NOT NULL DEFAULT '0',
    `id_kit` INT(11) NOT NULL DEFAULT '0',
    `id_fase` INT(11) NOT NULL DEFAULT '0',
    `data_seduta` DATETIME NOT NULL,
    `giorno` VARCHAR(50) NOT NULL COLLATE 'latin1_swedish_ci',
    `id_reparto` INT(11) NOT NULL,
    `importo_prestazione` DECIMAL(6,2) NOT NULL DEFAULT '0.00',
    `peso_prestazione` DECIMAL(5,2) NOT NULL DEFAULT '0.00',
    `durata_prestazione` SMALLINT(6) NOT NULL DEFAULT '0',
    `id_operatore` INT(11) NOT NULL DEFAULT '0',
    `solo` TINYINT(1) NOT NULL DEFAULT '0',
    `confermato` TINYINT(1) NOT NULL DEFAULT '0',
    `annullato` TINYINT(1) NOT NULL DEFAULT '0',
    `id_esito` TINYINT(4) NOT NULL DEFAULT '0',
    `creazione` DATETIME NOT NULL,
    `univoco` VARCHAR(255) NULL DEFAULT NULL COLLATE 'latin1_swedish_ci',
    PRIMARY KEY (`id`) USING BTREE,
    INDEX `id_cartella` (`id_cartella`) USING BTREE,
    INDEX `id_kit` (`id_kit`) USING BTREE,
    INDEX `data_seduta` (`data_seduta`) USING BTREE,
    INDEX `confermato` (`confermato`) USING BTREE,
    INDEX `id_utente` (`id_utente`) USING BTREE,
    INDEX `id_reparto` (`id_reparto`) USING BTREE,
    INDEX `id_anagrafica` (`id_anagrafica`) USING BTREE,
    INDEX `id_fase` (`id_fase`) USING BTREE,
    INDEX `peso_prestazione` (`peso_prestazione`) USING BTREE,
    INDEX `durata_prestazione` (`durata_prestazione`) USING BTREE,
    INDEX `solo` (`solo`) USING BTREE,
    INDEX `id_operatore` (`id_operatore`) USING BTREE,
    INDEX `univoco` (`univoco`) USING BTREE,
    INDEX `annullato` (`annullato`) USING BTREE,
    CONSTRAINT `prenkit` FOREIGN KEY (`id_kit`) REFERENCES `starbene_1`.`pmultiple_kit` (`id`) ON UPDATE CASCADE ON DELETE CASCADE
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=717
;

Upvotes: 1

Views: 87

Answers (3)

Rick James
Rick James

Reputation: 142208

Looks like it might be a case of "inflate-deflate". First it JOINs tables (inflate), then it GROUPs (deflate). In the middle there may be a lot of rows with a lot of work to do all the joining.

So, try to do the GROUP BY first.

SELECT ...
    FROM ( SELECT ... FROM s JOIN ss ...
            GROUP BY s.id, TIME(ss.data_seduta) ) AS aaa
    JOIN ...

You should not need to do a GROUP BY at the end.

Upvotes: 0

Bill Karwin
Bill Karwin

Reputation: 562230

The problem is caused here:

 JOIN pmultiple_prenotazioni pp ON (pp.id = ss.id_fisico OR pp.id = ss.id_virtuale) 

There is no way an index can help this. If you just had one term or the other in this join condition, it could use the primary key index for pp.id. But with the OR expression, it cannot.

You can see the effect in the EXPLAIN where it must do a table-scan (type: ALL) for pp. So the total number of join comparisons is the number of rows examined in ss or 10193 multiplied by the number of rows examined in pp or 23565. Total: 240 million!

You could separate it into two joins:

 LEFT JOIN pmultiple_prenotazioni pp1 ON (pp1.id = ss.id_fisico)
 LEFT JOIN pmultiple_prenotazioni pp2 ON (pp2.id = ss.id_virtuale)
 ...
 WHERE
    pp1.confermato = '1' AND pp1.annullato = '0' AND pp1.id_esito != 4
    AND pp2.confermato = '1' AND pp2.annullato = '0' AND pp2.id_esito != 4
    AND ...

Or you could use a UNION:

SELECT * FROM (
  SELECT s.id, ss.data_seduta
  FROM scadenze s
     JOIN cartelle_cliniche cc ON (cc.id = s.id_cartella)
     JOIN anagrafica a ON (a.id = cc.id_anagrafica)
     JOIN scadenze_sedute ss ON (ss.id_scadenza = s.id)
     JOIN pmultiple_prenotazioni pp ON (pp.id = ss.id_fisico) -- ONLY fisico
     LEFT JOIN medici_privati mp ON (mp.id = a.id_medico)
     LEFT JOIN operatorif o ON (o.id = cc.id_operatore_emittente)
  WHERE
        pp.confermato = '1' AND pp.annullato = '0' AND pp.id_esito != 4
        AND s.id_stato = '0' AND (DATE(s.data_scadenza) BETWEEN '2021-06-01' AND '2021-06-16')

  UNION

  SELECT s.id, ss.data_seduta
  FROM scadenze s
     JOIN cartelle_cliniche cc ON (cc.id = s.id_cartella)
     JOIN anagrafica a ON (a.id = cc.id_anagrafica)
     JOIN scadenze_sedute ss ON (ss.id_scadenza = s.id)
     JOIN pmultiple_prenotazioni pp ON (pp.id = ss.id_virtuale) -- ONLY virtuale
     LEFT JOIN medici_privati mp ON (mp.id = a.id_medico)
     LEFT JOIN operatorif o ON (o.id = cc.id_operatore_emittente)
  WHERE
        pp.confermato = '1' AND pp.annullato = '0' AND pp.id_esito != 4
        AND s.id_stato = '0' AND (DATE(s.data_scadenza) BETWEEN '2021-06-01' AND '2021-06-16')
) AS t
GROUP BY id, TIME(data_seduta);

Upvotes: 1

Kevin Leto
Kevin Leto

Reputation: 15

In MySQL Workbench you can see the execution plan of your query just by clicking on "Execution plan" (bottom right of the page after the execution of the query). Then it should be easy to know which step requires more time.

Upvotes: 0

Related Questions