Reputation: 225
I’m new entry on this tune db, i want to ask for how to optimize the query when the data reach around 35 millions. I’ve been read on several forum to do this below task, but it's not improving my query as per expected.
I hope someone can gave me clue on this :)
My query
SELECT a.msisdn, a.service, a.adn, a.operator, a.channel_subscribe,
a.subscribed_from, b.blacklist_status, b.date_created,
b.status, b.price
FROM `xmp_new`.`subscription` a
INNER JOIN msisdntools.`subscription_import` b ON a.msisdn = b.msisdn
WHERE b.msisdn IN (
SELECT msisdn
FROM `xmp_new`.`subscription`
WHERE msisdn NOT IN (
SELECT msisdn
FROM `msisdntools`.`tbl_blacklist`
WHERE blacklist_status='1'
AND active='1'
)
AND msisdn NOT IN (
SELECT msisdn
FROM subscription_import
WHERE DATE(date_created)=DATE(NOW())
)
AND operator = 'xxxx'
AND subscribed_from BETWEEN DATE('2015-05-16') AND DATE('2015-05-17')
)
AND DATE(b.date_created) = '2012-05-16'
AND b.STATUS = 0
AND b.price IS NULL;
The index of subscription_import
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
subscription_import 0 PRIMARY 1 id A 33136464 (NULL) (NULL) BTREE
subscription_import 1 msisdn 1 msisdn A 6627292 (NULL) (NULL) BTREE
subscription_import 1 service 1 service A 18 (NULL) (NULL) BTREE
subscription_import 1 short_code 1 adn A 18 (NULL) (NULL) BTREE
subscription_import 1 operator 1 operator A 18 (NULL) (NULL) BTREE
subscription_import 1 channel 1 channel A 18 (NULL) (NULL) BTREE
index on subscription tables
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
subscription 0 PRIMARY 1 id A 2179811 (NULL) (NULL) BTREE
subscription 1 msisdn 1 msisdn A 2179811 (NULL) (NULL) BTREE
subscription 1 service 1 service A 19 (NULL) (NULL) BTREE
subscription 1 short_code 1 adn A 19 (NULL) (NULL) BTREE
subscription 1 operator 1 operator A 19 (NULL) (NULL) BTREE
subscription 1 channel 1 channel_subscribe A 19 (NULL) (NULL) BTREE
subscription 1 subscribed_from 1 subscribed_from A 2179811 (NULL) (NULL) BTREE
subscription 1 subscribed_until 1 subscribed_until A 2179811 (NULL) (NULL) BTREE
subscription 1 active 1 active A 19 (NULL) (NULL) YES BTREE
subscription 1 time_created 1 time_created A 2179811 (NULL) (NULL) BTREE
subscription 1 time_updated 1 time_updated A 2179811 (NULL) (NULL) BTREE
Create table command for subscription_import
CREATE TABLE `subscription_import` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`msisdn` varchar(32) NOT NULL,
`service` varchar(64) NOT NULL,
`adn` varchar(8) NOT NULL,
`operator` varchar(32) NOT NULL,
`channel` varchar(16) NOT NULL,
`status` tinyint(4) NOT NULL,
`description` varchar(20) DEFAULT NULL,
`blacklist_status` tinyint(4) NOT NULL,
`date_created` datetime NOT NULL,
`date_modified` datetime NOT NULL,
`price` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `msisdn` (`msisdn`) USING BTREE,
KEY `service` (`service`) USING BTREE,
KEY `short_code` (`adn`) USING BTREE,
KEY `operator` (`operator`) USING BTREE,
KEY `channel` (`channel`) USING BTREE,
KEY `date_created` (`date_created`),
KEY `STATUS` (`status`),
KEY `price` (`price`)
) ENGINE=InnoDB AUTO_INCREMENT=33274291 DEFAULT CHARSET=utf8
create table command for table subscription
CREATE TABLE `subscription` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`transaction_id_subscribe` varchar(64) NOT NULL,
`transaction_id_unsubscribe` varchar(64) NOT NULL,
`msisdn` varchar(32) NOT NULL,
`service` varchar(64) NOT NULL,
`adn` varchar(8) NOT NULL,
`operator` varchar(32) NOT NULL,
`channel_subscribe` varchar(16) NOT NULL,
`channel_unsubscribe` varchar(16) NOT NULL,
`subscribed_from` datetime NOT NULL,
`subscribed_until` datetime NOT NULL,
`partner` varchar(20) NOT NULL,
`active` tinyint(1) DEFAULT NULL,
`time_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`time_updated` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
KEY `msisdn` (`msisdn`) USING BTREE,
KEY `service` (`service`) USING BTREE,
KEY `short_code` (`adn`) USING BTREE,
KEY `operator` (`operator`) USING BTREE,
KEY `channel` (`channel_subscribe`) USING BTREE,
KEY `subscribed_from` (`subscribed_from`) USING BTREE,
KEY `subscribed_until` (`subscribed_until`) USING BTREE,
KEY `active` (`active`) USING BTREE,
KEY `time_created` (`time_created`) USING BTREE,
KEY `time_updated` (`time_updated`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=7868759 DEFAULT CHARSET=utf8
Upvotes: 0
Views: 175
Reputation: 37655
It looks like you're trying to select from a table where a JOIN to each of two other tables fails (there is no matching record in either.)
First establish the subscription records you want:
SELECT ...
FROM subscription b
WHERE DATE(b.date_created) = '2012-05-16'
AND b.status = 0
AND b.price IS NULL
Then you want to exclude cases where there is a qualifying match in tbl_blacklist. So first match the cases where there is a match, but make it an exclusion by saying the matchid id is null
SELECT ...
FROM subscription b
(add)
| LEFT JOIN tbl_blacklist bl ON b.msisdn = bl.msisdn
| AND bl.blacklist_status = '1'
+- AND bl.active = '1'
WHERE DATE(b.date_created) = '2012-05-16'
AND b.status = 0
AND b.price IS NULL
+- AND bl.id IS NULL
Then do the same for the second join:
SELECT ...
FROM subscription b
LEFT JOIN tbl_blacklist bl ON b.msisdn = bl.msisdn
AND bl.blacklist_status = '1'
AND bl.active = '1'
(add)
| LEFT JOIN subscription_import si ON b.msisdn = si.msisdn
|+ AND DATE(si.date_created) = CURRENT_DATE()
WHERE DATE(b.date_created) = '2012-05-16'
AND b.status = 0
AND b.price IS NULL
AND bl.id IS NULL
|+ AND si.id IS NULL
I think I'm at least close enough that you can straighten the sense out.
In my experience MySQL has always been most efficient at JOINs, including OUTER JOINs. Not so much with EXISTS, and especially NOT EXISTS.
Upvotes: 1
Reputation: 142518
Don't use IN ( SELECT ...)
; it is very inefficient. Instead use JOIN
. For NOT IN ( SELECT ... )
, use LEFT JOIN ... WHERE id IS NULL
.
Do not "hide" a column inside a function (whenever possible): AND DATE(b.date_created) = '2012-05-16'
can be turned into AND b.date_created = '2012-05-16'
if date_created
is a DATE. If it is a DATETIME, then do:
AND b.date_created >= '2012-05-16'
AND b.date_created < '2012-05-16' + INTERVAL 1 DAY
With that, INDEX(date_created)
can be used. Even better might be the composite INDEX(price, status, date_created)
.
subscription
will need INDEX(msisdn, blacklist_status, active)
after turning it into a LEFT JOIN
.
DATE(NOW())
is the same as CURDATE()
. But WHERE DATE(date_created) = DATE(NOW())
could probably be WHERE date_created >= CURDATE()
.
subscription_import
will need INDEX(msisdn, date_created)
.
Is msisdn
unique? That is, could it be used as the 'natural' PRIMARY KEY
instead of id
?
Make those changes, performance will be better. Then come back for more advice/abuse.
Upvotes: 3