Ashe
Ashe

Reputation: 187

Optimizing SQL query in MySQL

I would like to know why this query takes is slow (about 10 to 20 seconds), the three tables used have 500,000 records, this is the query:

SELECT  *, 'rg_egresos' AS nombre_tabla
    FROM  rg_detallexml DE
    INNER JOIN  rg_egresos EG
    INNER JOIN  rg_emisor EM  ON DE.idContador = EG.id
      AND  DE.idDetalleXml = EG.idDetalleXml
      AND  DE.idContador = EM.idContador
      AND  DE.idDetalleXml = EM.idDetalleXml
    WHERE  DE.idContador = '14894'
      AND  DATE_FORMAT(dateFechaHora, '%Y-%m-%d') BETWEEN '2017-10-01'
                                                      AND '2017-10-31'
      AND  strTipodeComprobante = 'egreso'
      AND  version_xml = '3.2'
      AND  estado_factura = 0
      AND  modificado = 0;

And this is what it shows when I use EXPLAIN

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: EG
         type: index_merge
possible_keys: idx_idDetallexml,idx_estado_factura,idx_modificado,idx_idContador
          key: idx_idContador,idx_estado_factura,idx_modificado
      key_len: 4,4,4
          ref: NULL
         rows: 2111
        Extra: Using intersect(idx_idContador,idx_estado_factura,idx_modificado); Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: DE
         type: eq_ref
possible_keys: PRIMARY,idx_strTipodeComprobante,idx_idContador,idx_version_xml
          key: PRIMARY
      key_len: 4
          ref: db_pwf.EG.idDetalleXml
         rows: 1
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: EM
         type: ref
possible_keys: idx_idContador,idx_idDetallexml
          key: idx_idDetallexml
      key_len: 4
          ref: db_pwf.DE.idDetalleXml
         rows: 1
        Extra: Using where

Can you see a way to improve the query?, I have other queries working with bigger tables and they are faster, all the required fields have its index, thanks.

Table rg_detallexml:

+---------------------------------+--------------+------+-----+---------+----------------+
| Field                           | Type         | Null | Key | Default | Extra          |
+---------------------------------+--------------+------+-----+---------+----------------+
| idDetalleXml                    | int(10)      | NO   | PRI | NULL    | auto_increment |
| UUID                            | varchar(50)  | NO   | MUL | NULL    |                |
| dateFechaSubida                 | varchar(7)   | YES  |     | NULL    |                |
| idContador                      | int(10)      | NO   | MUL | NULL    |                |
| dateFechaHora                   | datetime     | YES  | MUL | NULL    |                |
| dateFechaHoraCertificacion      | datetime     | YES  |     | NULL    |                |
| dateFechaPago                   | datetime     | YES  |     | NULL    |                |
| intFolio                        | int(10)      | YES  |     | NULL    |                |
| strSerie                        | varchar(2)   | YES  |     | A       |                |
| doubleDescuento                 | double       | YES  |     | NULL    |                |
| doubleTotal                     | double       | YES  |     | NULL    |                |
| doubleSubtotal                  | double       | YES  |     | NULL    |                |
| duobleTotalImpuestosTrasladados | double       | YES  |     | NULL    |                |
| doubleTotalImpuestosRetenidos   | double       | YES  |     | NULL    |                |
| doubleTotalRetencionesLocales   | double       | YES  |     | NULL    |                |
| doubleTotalTrasladosLocales     | double       | YES  |     | NULL    |                |
| strTipodeComprobante            | varchar(15)  | YES  | MUL | NULL    |                |
| strMetodoDePago                 | varchar(150) | YES  |     | NULL    |                |
| strFormaDePago                  | varchar(150) | YES  |     | NULL    |                |
| strMoneda                       | varchar(10)  | YES  |     | NULL    |                |
| tipoCambio                      | double       | NO   |     | NULL    |                |
| strLugarExpedicion              | varchar(150) | YES  |     | NULL    |                |
| DIOT                            | int(1)       | YES  |     | 0       |                |
| version_xml                     | varchar(10)  | NO   | MUL | NULL    |                |
+---------------------------------+--------------+------+-----+---------+----------------+

Table rg_egresos:

+---------------------------+--------------+------+-----+---------+----------------+
| Field                     | Type         | Null | Key | Default | Extra          |
+---------------------------+--------------+------+-----+---------+----------------+
| id_egreso                 | int(11)      | NO   | PRI | NULL    | auto_increment |
| id                        | int(11)      | NO   | MUL | NULL    |                |
| idDetalleXml              | int(10)      | NO   | MUL | NULL    |                |
| idCatalogo                | int(19)      | NO   | MUL | NULL    |                |
| tipoCuenta                | int(11)      | NO   | MUL | NULL    |                |
| intRubro                  | int(1)       | NO   |     | NULL    |                |
| RFC                       | varchar(20)  | NO   | MUL | NULL    |                |
| compra_gastos_0_porciento | float        | NO   | MUL | NULL    |                |
| deducible                 | int(1)       | NO   |     | NULL    |                |
| compra_gastos_exentos     | float        | NO   |     | NULL    |                |
| no_deducibles             | float        | NO   |     | NULL    |                |
| estado_factura            | int(11)      | NO   | MUL | NULL    |                |
| fecha                     | date         | NO   | MUL | NULL    |                |
| total_xml                 | double       | NO   |     | NULL    |                |
| subtotal_xml              | double       | NO   |     | NULL    |                |
| iva_xml                   | double       | NO   |     | NULL    |                |
| total_impuestos           | double       | NO   |     | NULL    |                |
| abonado                   | double       | NO   |     | NULL    |                |
| subtotal                  | double       | NO   |     | NULL    |                |
| iva                       | double       | NO   |     | NULL    |                |
| pendiente                 | double       | NO   |     | NULL    |                |
| subtotal_sin_iva          | double       | NO   |     | NULL    |                |
| acreditable               | int(1)       | NO   | MUL | 0       |                |
| fecha_operacion           | datetime     | NO   | MUL | NULL    |                |
| modificado                | int(1)       | NO   | MUL | NULL    |                |
| UUID                      | varchar(50)  | NO   | MUL | NULL    |                |
| IEPS                      | double       | NO   |     | NULL    |                |
| retencion_iva             | double       | NO   |     | NULL    |                |
| retencion_isr             | double       | NO   |     | NULL    |                |
| imp_local                 | double       | NO   |     | 0       |                |
| enviado_a                 | int(11)      | NO   | MUL | NULL    |                |
| enviado_al_iva            | int(1)       | NO   |     | NULL    |                |
| EsNomina                  | int(1)       | NO   | MUL | 0       |                |
| dateFechaPago             | date         | NO   | MUL | NULL    |                |
| nota_credito              | int(1)       | NO   | MUL | NULL    |                |
| extranjero                | int(1)       | NO   | MUL | NULL    |                |
| pago_banco                | int(1)       | NO   | MUL | NULL    |                |
| idBanco_Pago              | int(20)      | NO   | MUL | NULL    |                |
| movimientoPago            | int(10)      | NO   |     | NULL    |                |
| saldo_banco               | varchar(50)  | NO   |     | NULL    |                |
| tipo_pago                 | int(1)       | NO   |     | 0       |                |
| responsable               | varchar(100) | NO   |     | NULL    |                |
+---------------------------+--------------+------+-----+---------+----------------+

Table rg_emisor:

+-----------------+--------------+------+-----+---------+----------------+
| Field           | Type         | Null | Key | Default | Extra          |
+-----------------+--------------+------+-----+---------+----------------+
| idEmisor        | int(10)      | NO   | PRI | NULL    | auto_increment |
| idDetalleXml    | int(10)      | NO   | MUL | NULL    |                |
| idContador      | int(10)      | NO   | MUL | NULL    |                |
| strRFC          | varchar(13)  | NO   |     | NULL    |                |
| strNombreEmisor | varchar(200) | YES  |     | NULL    |                |
| strRegimen      | varchar(250) | YES  |     | NULL    |                |
| strPais         | varchar(40)  | YES  |     | MX      |                |
| strEstado       | varchar(50)  | YES  |     | NULL    |                |
| intCP           | int(5)       | YES  |     | NULL    |                |
| strMunicipio    | varchar(250) | YES  |     | NULL    |                |
| strLocalidad    | varchar(250) | YES  |     | NULL    |                |
| strColonia      | varchar(250) | YES  |     | NULL    |                |
| intNumExt       | int(10)      | YES  |     | NULL    |                |
| intNumInt       | int(10)      | YES  |     | NULL    |                |
| strCalle        | varchar(250) | YES  |     | NULL    |                |
| regimenFiscal   | varchar(20)  | YES  |     | NULL    |                |
+-----------------+--------------+------+-----+---------+----------------+

Upvotes: 4

Views: 114

Answers (3)

Rick James
Rick James

Reputation: 142528

I see two partial Answers in the other replies. Let's tie them together.

Change

  AND  DATE_FORMAT(dateFechaHora, '%Y-%m-%d') BETWEEN '2017-10-01'
                                                  AND '2017-10-31'

to

  AND DE.dateFechaHora >= '2017-10-01'
  AND DE.dateFechaHora  < '2017-10-01' + INTERVAL 1 MONTH

and

If DE is a good starting table:

DE:  INDEX(idContador, strTipodeComprobante, version_xml, dateFechaHora)
       -- date last; others in any order

If EG is a better starting table:

EG:  INDEX(estado_factura, modificado, id)   -- in any order
DE:  INDEX(idContador, idDetalleXml,
           strTipodeComprobante, version_xml, dateFechaHora)

Also have

EM:  INDEX(idContador, idDetalleXml)   -- in either order

"Using intersect" almost always is a clue that you should have a composite index instead of separate indexes. (The separate indexes may be useful for other queries.)

(That is, add all those indexes, then let the Optimizer decide.)

Please use SHOW CREATE TABLE, not the less-descriptive DESCRIBE.

Do you really need SELECT *?

The query, after my suggestions:

SELECT  DE.*,
        EG.*,
        EM.*,
        'rg_egresos' AS nombre_tabla
    FROM  rg_detallexml DE
    INNER JOIN  rg_egresos EG
       ON  DE.idContador = EG.id
      AND  DE.idDetalleXml = EG.idDetalleXml
    INNER JOIN  rg_emisor EM
       ON  DE.idContador = EM.idContador
      AND  DE.idDetalleXml = EM.idDetalleXml
    WHERE  DE.idContador = '14894'
      AND  DE.dateFechaHora >= '2017-10-01'
      AND  DE.dateFechaHora  < '2017-10-01' + INTERVAL 1 MONTH
      AND  DE.strTipodeComprobante = 'egreso'
      AND  DE.version_xml = '3.2'
      AND  EG.estado_factura = 0
      AND  EG.modificado = 0;

Upvotes: 1

Thorsten Kettner
Thorsten Kettner

Reputation: 95101

Now that you've shown the tables, we see that rg_egresos.id is not the table's ID. There can hence be multiple records for one contador in the table. Let's look at the tables and the query more closely:

All tables contain a contador ID and a DetalleXml ID. You want to join them all on these two fields. So you start with the rg_detallexml and get all records for the contador. With the idDetalleXml thus found, you search for rg_egresos and rg_emisors.

This is a bit strange. First of all an rg_detallexml is obviously linked to one contador, but in the other tables the rg_detallexml can be linked to another contador. Well, that may be possible (some kind of from/to relation maybe). But with five rg_egresos records and four rg_emisors records for an rg_detallexml/contador, you'd select thirty records, because you are combining rg_egresos records with rg_emisors records that are not really related.

Anyway: you want to find rg_detallexml quickly.

create index idx_de on rg_detallexml(idcontador, strtipodecomprobante, version_xml,
                                     datefechahora, iddetallexml);

Then you look for rg_egresos:

create index idx_eg on rg_egresos(id, iddetallexml, estado_factura, modificad);

At last you look for rg_emisor:

create index idx_em on rg_emisor(idcontador, iddetallexml);

As the columns are present in all tables, we could of course go through them in any order. Starting with rg_detallexml seems most natural and most restrictive, too, but that is not necessarily best. So you may want to offer the DBMS yet another index:

create index idx_eg2 on rg_egresos(id, estado_factura, modificad, iddetallexml);

which would allow the DBMS to look up the contador's records in this table first and with the added criteria find related iddetallexml here.

Upvotes: 3

fthiella
fthiella

Reputation: 49089

The biggest problem I see is on this part:

DATE_FORMAT(dateFechaHora, '%Y-%m-%d') BETWEEN '2017-10-01' AND '2017-10-31'

is dateFechaHora a datetime field? Why are you converting a datetime field to a string (DATE_FORMAT)? even if you have an index on the dateFechaHora field, it won't be used.

I would suggest you to use this code instead:

and DateFechaHora >= '2017-10-01' and DateFechaHora < '2017-11-01'
                                                       ^^^^^^^^^^

yes it's the following day and it won't be included.

So your query might look like this:

select
  *,
  'rg_egresos' AS nombre_tabla 
from
  rg_detallexml DE inner join rg_egresos EG
  on DE.idContador = EG.id and DE.idDetalleXml = EG.idDetalleXml 
  inner join rg_emisor EM on DE.idContador = EM.idContador 
  and DE.idDetalleXml = EM.idDetalleXml 
where
  DE.idContador = '14894' 
  and dateFechaHora >= '2017-10-01' and dateFechaHora < '2017-11-01' 
  and strTipodeComprobante = 'egreso'
  and version_xml = '3.2' 
  and estado_factura = 0 
  and modificado = 0
;

Upvotes: 1

Related Questions