deep adeshra
deep adeshra

Reputation: 60

Suggestions for speeding up the query

I am facing issues with performance of below pasted query

SELECT
  leads.index AS id,
  leads.customer AS customer,
  leads.email AS email,
  leads.icp_name AS icp_name,
  DATEDIFF(CURRENT_TIMESTAMP(), leads.updated_datetime) AS days_since_contacted,
  leads.contacted AS contacted,
  IFNULL(SUM(rev_sent.opens_count), 0) AS opens,
  leads.responded AS responded,
  rev_replies.category AS category,
  (blackList.domain IS NOT NULL) AS blacklisted,
  leads.first_name AS first_name,
  leads.last_name AS last_name,
  leads.title AS title,
  leads.company_name AS company_name,
  leads.company_industry AS company_industry,
  leads.company_url AS company_url,
  leads.linkedin_handle AS linkedin_handle,
  leads.geo_city AS geo_city,
  leads.geo_state AS geo_state,
  leads.geo_country AS geo_country,
  leads.custom1 as custom1,
  leads.custom2 as custom2,
  leads.custom3 as custom3
FROM
  leads
  LEFT JOIN rev_sent ON leads.customer = rev_sent.customer
  AND leads.email = rev_sent.lead_email
  LEFT JOIN rev_replies ON rev_sent.customer = rev_replies.customer
  AND (
    rev_sent.lead_email = rev_replies.lead_email
    OR rev_sent.thread_id = rev_replies.thread_id
  )
  LEFT JOIN blackList ON leads.customer = blackList.customer
  AND SUBSTRING(leads.email, LOCATE('@', leads.email) + 1) = blackList.domain
WHERE
  leads.customer = ""
  AND leads.icp_name_id = ""
  AND (
    rev_replies.lead_email IS NULL
    OR rev_replies.category IS NOT NULL
  )
GROUP BY
  leads.customer,
  leads.email;

here are the definitions of the tables being used

leads table:

CREATE TABLE
  `leads` (
    `index` int NOT NULL AUTO_INCREMENT,
    `email` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
    `customer` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
    `address` text COLLATE utf8mb4_unicode_ci,
    `bio` text COLLATE utf8mb4_unicode_ci,
    `category_naics_code` text COLLATE utf8mb4_unicode_ci,
    `category_sic_code` text COLLATE utf8mb4_unicode_ci,
    `com_sub_industry` text COLLATE utf8mb4_unicode_ci,
    `company_facebook` text COLLATE utf8mb4_unicode_ci,
    `company_industry` text COLLATE utf8mb4_unicode_ci,
    `company_linked_handle` text COLLATE utf8mb4_unicode_ci,
    `company_logo` text COLLATE utf8mb4_unicode_ci,
    `company_name` text COLLATE utf8mb4_unicode_ci,
    `company_sector` text COLLATE utf8mb4_unicode_ci,
    `company_tech_tags` text COLLATE utf8mb4_unicode_ci,
    `company_twitter` text COLLATE utf8mb4_unicode_ci,
    `company_twitter_bio` text COLLATE utf8mb4_unicode_ci,
    `company_twitter_followers` text COLLATE utf8mb4_unicode_ci,
    `company_type` text COLLATE utf8mb4_unicode_ci,
    `company_url` text COLLATE utf8mb4_unicode_ci,
    `description` text COLLATE utf8mb4_unicode_ci,
    `employee_range` text COLLATE utf8mb4_unicode_ci,
    `employment_seniority` text COLLATE utf8mb4_unicode_ci,
    `estimated_revenues` text COLLATE utf8mb4_unicode_ci,
    `feedback` text COLLATE utf8mb4_unicode_ci,
    `first_name` text COLLATE utf8mb4_unicode_ci,
    `full_name` text COLLATE utf8mb4_unicode_ci,
    `geo_country` text COLLATE utf8mb4_unicode_ci,
    `geo_state` text COLLATE utf8mb4_unicode_ci,
    `geo_city` text COLLATE utf8mb4_unicode_ci,
    `github_handle` text COLLATE utf8mb4_unicode_ci,
    `id` text COLLATE utf8mb4_unicode_ci,
    `last_name` text COLLATE utf8mb4_unicode_ci,
    `linkedin_handle` text COLLATE utf8mb4_unicode_ci,
    `location` text COLLATE utf8mb4_unicode_ci,
    `location1` text COLLATE utf8mb4_unicode_ci,
    `metrics_raised` text COLLATE utf8mb4_unicode_ci,
    `phone_number` text COLLATE utf8mb4_unicode_ci,
    `public_company_revenues` text COLLATE utf8mb4_unicode_ci,
    `qa_status` text COLLATE utf8mb4_unicode_ci,
    `research_worker` text COLLATE utf8mb4_unicode_ci,
    `resume_cv` text COLLATE utf8mb4_unicode_ci,
    `site_phone_number` text COLLATE utf8mb4_unicode_ci,
    `source` text COLLATE utf8mb4_unicode_ci,
    `tags` text COLLATE utf8mb4_unicode_ci,
    `timezone` text COLLATE utf8mb4_unicode_ci,
    `title` text COLLATE utf8mb4_unicode_ci,
    `twitter_followers` text COLLATE utf8mb4_unicode_ci,
    `twitter_following` text COLLATE utf8mb4_unicode_ci,
    `twitter_site` text COLLATE utf8mb4_unicode_ci,
    `utcoffset` text COLLATE utf8mb4_unicode_ci,
    `website` text COLLATE utf8mb4_unicode_ci,
    `year_founded` text COLLATE utf8mb4_unicode_ci,
    `employment_role` text COLLATE utf8mb4_unicode_ci,
    `icp_name` text COLLATE utf8mb4_unicode_ci,
    `icp_name_id` text COLLATE utf8mb4_unicode_ci,
    `icp_name_uuid` text COLLATE utf8mb4_unicode_ci,
    `icp_name_ubico_id` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
    `contacted` int DEFAULT '0',
    `thread_id` text COLLATE utf8mb4_unicode_ci,
    `responded` int DEFAULT '0',
    `count_opens` int DEFAULT '0',
    `updated_datetime` datetime DEFAULT NULL,
    `created_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    `crm_owner_id` text COLLATE utf8mb4_unicode_ci,
    `custom1` text COLLATE utf8mb4_unicode_ci,
    `custom2` text COLLATE utf8mb4_unicode_ci,
    `custom3` text COLLATE utf8mb4_unicode_ci,
    `title_seniority` text COLLATE utf8mb4_unicode_ci,
    `title_function` text COLLATE utf8mb4_unicode_ci,
    PRIMARY KEY (`index`),
    KEY `customer_email` (`customer`, `email`),
    KEY `idx_customer` (`customer`),
    KEY `icp_name_ubico_id_index` (`icp_name_ubico_id`)
  ) ENGINE = InnoDB AUTO_INCREMENT = 1464899 DEFAULT CHARSET = utf8mb4 COLLATE = utf8mb4_unicode_ci

rev_sent table

CREATE TABLE `rev_sent` (
  `index` int NOT NULL AUTO_INCREMENT,
  `message_id` text COLLATE utf8mb4_unicode_ci,
  `thread_id` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `subject` text COLLATE utf8mb4_unicode_ci,
  `nylas_email_id` text COLLATE utf8mb4_unicode_ci,
  `nylas_email` text COLLATE utf8mb4_unicode_ci,
  `customer` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `icp_name` text COLLATE utf8mb4_unicode_ci,
  `icp_name_uuid` text COLLATE utf8mb4_unicode_ci,
  `lead_email` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `opens_count` int DEFAULT '0',
  `replies_count` int DEFAULT '0',
  `updated_datetime` datetime DEFAULT NULL,
  `created_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `message_uuid` text COLLATE utf8mb4_unicode_ci,
  PRIMARY KEY (`index`),
  KEY `customer_lead` (`customer`,`lead_email`),
  KEY `idx_rs_thread` (`thread_id`)
) ENGINE=InnoDB AUTO_INCREMENT=689646 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

and rev_replies table

CREATE TABLE `rev_replies` (
  `index` int NOT NULL AUTO_INCREMENT,
  `message_id` text COLLATE utf8mb4_unicode_ci,
  `thread_id` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `nylas_email_id` text COLLATE utf8mb4_unicode_ci,
  `nylas_email` text COLLATE utf8mb4_unicode_ci,
  `customer` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `icp_name` text COLLATE utf8mb4_unicode_ci,
  `icp_name_uuid` text COLLATE utf8mb4_unicode_ci,
  `lead_email` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `subject` text COLLATE utf8mb4_unicode_ci,
  `body` text COLLATE utf8mb4_unicode_ci,
  `category` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `reviewed` tinyint DEFAULT '0',
  `message_datetime` datetime DEFAULT NULL,
  `created_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`index`),
  KEY `customer_lead` (`customer`,`lead_email`),
  KEY `idx_rr_thread` (`thread_id`)
) ENGINE=InnoDB AUTO_INCREMENT=50954 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

Below is the EXPLAIN result of the query enter image description here

Basically the 2nd left join with OR condition causes the issue, I am not sure how can I remove that join.

Can one suggest me how can I improve join performance ?

Upvotes: 0

Views: 60

Answers (1)

Rick James
Rick James

Reputation: 142528

leads:  INDEX(icp_name_id, customer, email)
rev_sent:  INDEX(customer,  opens_count, lead_email, thread_id)
rev_replies:  INDEX(customer,  thread_id, lead_email, category)
blackList:  INDEX(customer,  domain)

When adding a composite index, DROP index(es) with the same leading columns. That is, when you have both INDEX(a) and INDEX(a,b), toss the former.

There may be more optimizations... IF...

  • Is (leads.customer, leads.email) UNIQUE?
  • If not, then there could be a problem with the GROUP BY (cf ONLY_FULL_GROUP_BY)
  • If there is a single row for that pair in leads, but the other tables have multiple matching rows, then we need to turn the query inside out.

JOINs are performed before GROUP BY. That is, JOINs often create a bigger intermediate table, only to have the number of rows shrunk back by the GROUP BY. I am hoping to avoid this "explode-implode".

Upvotes: 1

Related Questions