neeraj
neeraj

Reputation: 345

speed up complex query with joins

I am trying to make this query faster

SELECT
    contacts_cstm.case_id_c AS 'Case_id',
CONCAT(cont.first_name, ' ',cont.last_name) AS 'Contact Name',
contacts_cstm.id_c AS 'Id',
'Contacts' AS 'Module',
reso_tax_preparation_cstm.percentage_paid_c as '% Paid'
[... more cols in select ...]
FROM
contacts cont
INNER JOIN contacts_cstm on contacts_cstm.id_c = cont.id and cont.deleted=0
INNER JOIN leads on leads.contact_id = cont.id AND leads.deleted=0
LEFT JOIN leads_aggre_aggregatevalues_1_c ON leads.id = leads_aggre_aggregatevalues_1_c.leads_aggre_aggregatevalues_1leads_ida AND leads_aggre_aggregatevalues_1_c.deleted=0
LEFT JOIN aggre_aggregatevalues_cstm ON leads_aggre_aggregatevalues_1_c.leads_aggre_aggregatevalues_1aggre_aggregatevalues_idb = aggre_aggregatevalues_cstm.id_c 
Left JOIN contacts_ccs_ccs_1_c on contacts_ccs_ccs_1_c.contacts_ccs_ccs_1contacts_ida = cont.id AND contacts_ccs_ccs_1_c.deleted = 0
Left JOIN ccs_ccs on ccs_ccs.id = contacts_ccs_ccs_1_c.contacts_ccs_ccs_1ccs_ccs_idb AND ccs_ccs.deleted = 0
LEFT JOIN ccs_ccs_cstm on ccs_ccs_cstm.id_c = ccs_ccs.id
LEFT JOIN users advocate on advocate.id = contacts_cstm.user_id1_c
LEFT JOIN users practitioner on practitioner.id = contacts_cstm.user_id2_c
LEFT JOIN contacts_reso_tax_preparation_1_c on contacts_reso_tax_preparation_1_c.contacts_reso_tax_preparation_1contacts_ida = cont.id 
LEFT JOIN reso_tax_preparation_cstm on  reso_tax_preparation_cstm.id_c = contacts_reso_tax_preparation_1_c.contacts_reso_tax_preparation_1reso_tax_preparation_idb AND contacts_reso_tax_preparation_1_c.deleted=0
 
WHERE
(DATEDIFF(NOW(),contacts_cstm.last_contact_date_c) >= 14)  
                   
AND (reso_tax_preparation_cstm.percentage_paid_c != '100.00' OR (reso_tax_preparation_cstm.percentage_paid_c = '100.00' AND DATEDIFF(NOW(),contacts_cstm.last_contact_date_c) < 120) ) 

I tried following things

using Explain to see details table contacts_reso_tax_preparation_1_c used 113470 rows .

this is the explained output

enter image description here

so these condition

AND (reso_tax_preparation_cstm.percentage_paid_c != '100.00' OR (reso_tax_preparation_cstm.percentage_paid_c = '100.00' AND DATEDIFF(NOW(),contacts_cstm.last_contact_date_c) < 120) ) 

taking time

  1. moved deleted=0 condition from where condition to join condition

  2. Applied limit 1 to exist queries

  3. innodb to related tables

  4. applied indexed to following columns by these command

    ALTER TABLE reso_tax_preparation_cstm ADD INDEX(percentage_paid_c); ALTER TABLE contacts_cstm ADD INDEX( ctax_status_update_date_c, last_contact_date_c, resolution_type_c, resolution_service_level_c, misc_reso_service_level_c, tax_prep_missing_info_c); ALTER TABLE contacts_cstm ADD INDEX(user_id2_c); ALTER TABLE reso_tax_preparation_cstm ADD INDEX(service_status_c); ALTER TABLE reso_ancillary_services_cstm ADD INDEX(service_status_c); ALTER TABLE contacts_cstm ADD INDEX(missing_info_final_deadline_c);

  5. trying to split these queries into unions ,but count changes.

query still takes 9 seconds

  1. converted left joins to subqueries

    SELECT
     contacts_cstm.case_id_c AS 'Case_id',
      CONCAT( cont.first_name, ' ', cont.last_name ) AS 'Contact Name',
      contacts_cstm.id_c AS 'Id',
      'Contacts' AS 'Module',
       (select percentage_paid_c  AS '% Paid' from  
        contacts_reso_tax_preparation_1_c 
      LEFT JOIN
        reso_tax_preparation_cstm 
        ON reso_tax_preparation_cstm.id_c = contacts_reso_tax_preparation_1_c.contacts_reso_tax_preparation_1reso_tax_preparation_idb 
       AND contacts_reso_tax_preparation_1_c.deleted = 0 
    
        where 
        contacts_reso_tax_preparation_1_c.contacts_reso_tax_preparation_1contacts_ida = cont.id 
       AND
       ( 
           reso_tax_preparation_cstm.percentage_paid_c != '100.00' 
       OR
       (
          reso_tax_preparation_cstm.percentage_paid_c = '100.00' 
          AND DATEDIFF( NOW(), contacts_cstm.last_contact_date_c) < 120 
       )
    )
      limit 1
      ),
    
    DATEDIFF(NOW(), contacts_cstm.last_contact_date_c) AS 'Days since Last Contact',
    (
       SELECT
          COUNT(reqcl_requiredclientinfo.id) 
       FROM
          contacts 
          INNER JOIN
             contacts_cstm 
             ON contacts_cstm.id_c = contacts.id 
          LEFT JOIN
             contacts_reqcl_requiredclientinfo_1_c 
             ON contacts_reqcl_requiredclientinfo_1_c.contacts_reqcl_requiredclientinfo_1contacts_ida = contacts.id 
          LEFT JOIN
             reqcl_requiredclientinfo 
             ON reqcl_requiredclientinfo.id = contacts_reqcl_requiredclientinfo_1_c.contacts_reqcl_requiredclientinfo_1reqcl_requiredclientinfo_idb 
             AND reqcl_requiredclientinfo.deleted = 0 
          LEFT JOIN
             reqcl_requiredclientinfo_cstm 
             ON reqcl_requiredclientinfo_cstm.id_c = reqcl_requiredclientinfo.id 
       WHERE
          contacts.id = cont.id 
    )
    AS 'Number of Required Info Records',
    contacts_cstm.resolution_type_c AS 'Resolution Type',
    CONCAT( advocate.first_name, ' ', advocate.last_name ) AS 'Current Case Advocate',
    CONCAT( practitioner.first_name, ' ', advocate.last_name ) AS 'Current Practitioner',
    DATEDIFF( NOW(), contacts_cstm.ctax_status_update_date_c) AS 'Days in active Ctax Status ',
    contacts_cstm.resolution_service_level_c AS 'Resolution Service Level',
    contacts_cstm.tax_preparation_level_c AS 'Tax Prep Service Level',
    contacts_cstm.misc_reso_service_level_c AS 'Misc Res Service Level',
    ccs_ccs_cstm.ccs_status_c AS `CCS Status` 
    FROM
      contacts cont 
     INNER JOIN
       contacts_cstm 
       ON contacts_cstm.id_c = cont.id 
    INNER JOIN
       leads 
       ON leads.contact_id = cont.id 
    LEFT JOIN
       leads_aggre_aggregatevalues_1_c 
       ON leads.id = leads_aggre_aggregatevalues_1_c.leads_aggre_aggregatevalues_1leads_ida 
       AND leads_aggre_aggregatevalues_1_c.deleted = 0 
    LEFT JOIN
       aggre_aggregatevalues_cstm 
       ON leads_aggre_aggregatevalues_1_c.leads_aggre_aggregatevalues_1aggre_aggregatevalues_idb = aggre_aggregatevalues_cstm.id_c 
    LEFT JOIN
       contacts_ccs_ccs_1_c 
       ON contacts_ccs_ccs_1_c.contacts_ccs_ccs_1contacts_ida = cont.id 
       AND contacts_ccs_ccs_1_c.deleted = 0 
    LEFT JOIN
       ccs_ccs 
       ON ccs_ccs.id = contacts_ccs_ccs_1_c.contacts_ccs_ccs_1ccs_ccs_idb 
       AND ccs_ccs.deleted = 0 
    LEFT JOIN
       ccs_ccs_cstm 
       ON ccs_ccs_cstm.id_c = ccs_ccs.id 
    LEFT JOIN
       users advocate 
       ON advocate.id = contacts_cstm.user_id1_c 
    LEFT JOIN
       users practitioner 
       ON practitioner.id = contacts_cstm.user_id2_c 
    

this reduces time and count , but i wanted to be sure this is right

what more can i try

Upvotes: 0

Views: 58

Answers (1)

Rick James
Rick James

Reputation: 142278

  • Avoid un-sargable constructs:

    AND  DATEDIFF(NOW(),contacts_cstm.last_contact_date_c) < 120) 
    

-->

  AND  contacts_cstm.last_contact_date_c > NOW() - INTERVAL 120 DAY
  • OR is hard to Optimize. If you can't get rid of it, we can talk about using UNION.

  • Remove any Joins that do not actually participate in the query.

  • Single-column indexes are sometimes not as good as 'composite' (multi-column) indexes.

  • Some of these indexes may help with the original query andor with the 'subquery' reformulation:

    aggre_aggregatevalues_cstm:  INDEX(id_c)
    ccs_ccs_cstm:  INDEX(id_c,  ccs_status_c)
    contacts_cstm:  INDEX(id_c)
    contacts_cstm:  INDEX(last_contact_date_c, id_c,  case_id_c, user_id1_c, user_id2_c)
    contacts_ccs_ccs_1_c:  INDEX(contacts_ccs_ccs_1contacts_ida,  deleted, contacts_ccs_ccs_1ccs_ccs_idb)
    contacts_reso_tax_preparation_1_c:  INDEX(contacts_reso_tax_preparation_1contacts_ida,  contacts_reso_tax_preparation_1reso_tax_preparation_idb, deleted)
    contacts_reqcl_requiredclientinfo_1_c:  INDEX(contacts_reqcl_requiredclientinfo_1contacts_ida,  contacts_reqcl_requiredclientinfo_1reqcl_requiredclientinfo_idb)
    leads:  INDEX(contact_id,  deleted, id)
    leads:  INDEX(contact_id,  id)
    leads_aggre_aggregatevalues_1_c:  INDEX(leads_aggre_aggregatevalues_1leads_ida,  deleted, leads_aggre_aggregatevalues_1aggre_aggregatevalues_idb)
    reso_tax_preparation_cstm:  INDEX(percentage_paid_c, id_c)
    reqcl_requiredclientinfo_cstm:  INDEX(id_c)
    

When adding a composite index, DROP index(es) with the same leading columns. That is, when you have both INDEX(a) and INDEX(a,b), toss the former.

Upvotes: 1

Related Questions