Long
Long

Reputation: 83

Mysql left join take too much time

Can anyone help me to know why is my query take too much time to excute. I have 2 table, transaction_log and purchasing_receipt.

Here is the DDL:

-- payment.purchasing_receipt definition

CREATE TABLE `purchasing_receipt` (
  `purchasing_receipt_id` bigint NOT NULL AUTO_INCREMENT COMMENT 'Purchasing Receipt Id',
  `processing_id` bigint DEFAULT NULL COMMENT 'Processing Id',
  `merchant_no` varchar(4) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'Merchant Number',
  `receipt_no` varchar(25) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'Receipt No',
  `ext_store_cd` varchar(25) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'External Store Code',
  `total_amount` decimal(12,0) DEFAULT NULL COMMENT 'Total Amount',
  `purchasing_date` datetime DEFAULT NULL COMMENT 'Purchasing Date',
  `created_date` datetime DEFAULT NULL COMMENT 'Created Date',
  `created_by` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'Created By',
  `updated_date` datetime DEFAULT NULL COMMENT 'Updated Date',
  `updated_by` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'Updated By',
  `ext_store_name` varchar(200) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'External Store Name',
  `pos_number` varchar(25) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'POS Number',
  `management_no` varchar(89) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'Management Number',
  PRIMARY KEY (`purchasing_receipt_id`),
  KEY `purchasing_receipt_processing_idx` (`processing_id`),
  KEY `ix_batch_p` (`processing_id`,`purchasing_date`,`total_amount`,`ext_store_cd`,`merchant_no`),
  CONSTRAINT `purchasing_receipt_processing` FOREIGN KEY (`processing_id`) REFERENCES `batch_processing` (`processing_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=2019121320000000015 DEFAULT CHARSET=utf8 COLLATE=utf8_bin COMMENT='Purchasing Receipt';

and:

-- payment.transaction_log definition

CREATE TABLE `transaction_log` (
  `transaction_id` char(20) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL COMMENT 'Transaction Id',
  `transaction_date` datetime DEFAULT NULL COMMENT 'Transaction Date',
  `transaction_status` varchar(1) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL COMMENT 'Transaction Status',
  `transaction_amount` decimal(8,0) DEFAULT NULL COMMENT 'Transaction Amount',
  `consumer_user_profile_id` bigint DEFAULT NULL COMMENT 'Consumer User Profile Id',
  `consumer_pay_profile_id` bigint DEFAULT NULL COMMENT 'Consumer Pay Profile Id',
  `merchant_no` varchar(4) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'Merchant Number',
  `merchant_name_kanji` varchar(64) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'Merchant Name Kanji',
  `merchant_name_kana` varchar(128) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'Merchant Name Kana',
  `affiliated_store_no` varchar(9) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'Affiliated Store Number',
  `affiliated_store_name_kanji` varchar(64) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'Affiliated Store Name Kanji',
  `affiliated_store_name_kana` varchar(128) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'Affiliated Store Name Kana',
  `terminal_no` varchar(13) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'Terminal Number',
  `cancelled_flag` char(1) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL COMMENT 'Cancelled Flag',
  `created_date` datetime DEFAULT NULL COMMENT 'Created Date',
  `created_by` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL COMMENT 'Created By',
  `updated_date` datetime DEFAULT NULL COMMENT 'Updated Date',
  `updated_by` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL COMMENT 'Updated By',
  `matched_flag` char(1) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `report_year_month` char(6) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `app_user_id` varchar(64) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `merchant_pay_name` varchar(64) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `installed_store_no` varchar(9) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL COMMENT 'Installed Store Number',
  `installed_store_name_kanji` varchar(64) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `installed_store_name_kana` varchar(128) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `usage_item_flag` char(1) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `processing_id` bigint DEFAULT NULL,
  PRIMARY KEY (`transaction_id`),
  KEY `transaction_log_sequence_idx` (`transaction_id`),
  KEY `con_idx` (`consumer_user_profile_id`),
  KEY `con_idx1` (`consumer_pay_profile_id`),
  KEY `FK_idx` (`terminal_no`),
  KEY `FK_idx1` (`processing_id`),
  KEY `optimize_batch_t` (`transaction_id`,`transaction_date`,`transaction_amount`,`installed_store_no`,`merchant_no`,`matched_flag`),
  CONSTRAINT `FK_con_pay` FOREIGN KEY (`consumer_pay_profile_id`) REFERENCES `consumer_pay_profile` (`consumer_pay_profile_id`),
  CONSTRAINT `FK_con_user` FOREIGN KEY (`consumer_user_profile_id`) REFERENCES `consumer_user_profile` (`consumer_user_profile_id`),
  CONSTRAINT `FK_process` FOREIGN KEY (`processing_id`) REFERENCES `batch_processing` (`processing_id`),
  CONSTRAINT `FK_terminal` FOREIGN KEY (`terminal_no`) REFERENCES `pos_terminal_profile` (`terminal_no`),
  CONSTRAINT `FK_tran_sequence` FOREIGN KEY (`transaction_id`) REFERENCES `transaction_id_sequence` (`transaction_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin COMMENT='Transaction Log';

I've indexed the column for two table:

CREATE INDEX optimize_batch_t
ON transaction_log(transaction_date, transaction_amount, merchant_no, matched_flag, processing_id);

CREATE INDEX ix_batch_p
ON purchasing_receipt(purchasing_date, total_amount, merchant_no, processing_id);

And here is the query:

SELECT p.purchasing_receipt_id
            , CASE WHEN count(t.transaction_id) = 1 THEN min(t.transaction_id) ELSE NULL END 
            , CASE WHEN count(t.transaction_id) = 1 THEN 'A' ELSE 'U' END 
FROM purchasing_receipt p 
LEFT JOIN transaction_log t 
            ON t.transaction_date BETWEEN DATE_SUB(p.purchasing_date, INTERVAL 2 MINUTE) AND DATE_ADD(p.purchasing_date, INTERVAL 2 MINUTE) 
            AND t.transaction_amount = p.total_amount 
            AND p.merchant_no = t.merchant_no 
            AND t.matched_flag = 'N' 
WHERE p.processing_id = 783 GROUP BY p.purchasing_receipt_id ;

I've check and it show that the query use correct index. enter image description here The transaction_log table has 261690 record. And Purchasing_receipt has 100063 record. And the query take 128 minutes to complete.

I've stucked with this issue a few days and still dont understand why the query execute is too slow like that.

Upvotes: 1

Views: 174

Answers (1)

Rick James
Rick James

Reputation: 142208

Start by making sure to use the same COLLATION in JOINs:

AND p.merchant_no = t.merchant_no 

`merchant_no` varchar(4)  CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
    DEFAULT NULL COMMENT 'Merchant Number',
`merchant_no` varchar(4)  CHARACTER SET utf8    COLLATE utf8_bin
    DEFAULT NULL COMMENT 'Merchant Number',

Fixing that may not be the whole solution.

What do you think this means? count(t.transaction_id) = 1 ? It means "count the number of rows (in each GROUP) where t.transaction_id is NOT NULL. Can there be 2 such rows? If so, that =1 will be 'false'.

Ranges with the BETWEEN being variables is hard to hard to optimizer. (Alas, I have no remedy.)

ON t.transaction_date
         BETWEEN DATE_SUB(p.purchasing_date, INTERVAL 2 MINUTE)
             AND DATE_ADD(p.purchasing_date, INTERVAL 2 MINUTE) 

But, we can probably get rid of the GROUP BY by changing the LEFT JOIN into

EXISTS ( SELECT 1
           FROM transaction_log t 
           WHERE  t.transaction_date
                     BETWEEN DATE_SUB(p.purchasing_date, INTERVAL 2 MINUTE)
                         AND DATE_ADD(p.purchasing_date, INTERVAL 2 MINUTE) 
              AND t.transaction_amount = p.total_amount 
              AND p.merchant_no = t.merchant_no 
              AND t.matched_flag = 'N' 
       )

Now, let's make an index for t:

INDEX(matched_flag, merchant_no, transaction_amount,   -- in any order
      transaction_date)  -- put the range last

You have two indexes with transaction_date before the end. Usually the utility of an index terminates with a "range". This makes your indexes not as useful as they could be.

OK, I hit a roadblock. You need the EXISTS twice. That takes you back to LEFT JOIN. But I don't want to get into the "inflate-deflate" problem. This where a JOIN leads to multiple rows (in a temp table), only to have a GROUP BY deflate back to what you started with. I'll assume the worst, namely that there can be multiple transactions in the 4 minute period for some receipts.

    SELECT  purchasing_receipt_id,
            min_tid,
            IF(min_tid IS NULL, 'U', 'A')
        FROM (
            SELECT p.purchasing_receipt_id,
                   ( SELECT min(t.transaction_id)
                       FROM transaction_log t  
                       WHERE  t.transaction_amount = p.total_amount
                         AND  t.merchant_no = p.merchant_no
                         AND  t.matched_flag = 'N'
                         AND  t.transaction_date
                                 BETWEEN p.purchasing_date - INTERVAL 2 MINUTE 
                                     AND p.purchasing_date + INTERVAL 2 MINUTE
                   )   AS min_tid
            WHERE  p.processing_id = 783
             ) AS x ;        

(I don't want to ask what this is about: AUTO_INCREMENT=2019121320000000015 .)

Upvotes: 1

Related Questions