Reputation: 74
I have to tables with 65.5 Million rows: 1)
CREATE TABLE RawData1 (
cdasite varchar(45) COLLATE utf8_unicode_ci NOT NULL,
id int(20) NOT NULL DEFAULT '0',
timedate datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
type int(11) NOT NULL DEFAULT '0',
status int(11) NOT NULL DEFAULT '0',
branch_id int(20) DEFAULT NULL,
branch_idString varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (id,cdasite,timedate),
KEY idx_timedate (timedate,cdasite)
) ENGINE=InnoDB;
2) Same table with partition (call it RawData2)
PARTITION BY RANGE ( TO_DAYS(timedate))
(PARTITION p20140101 VALUES LESS THAN (735599) ENGINE = InnoDB,
PARTITION p20140401 VALUES LESS THAN (735689) ENGINE = InnoDB,
.
.
PARTITION p20201001 VALUES LESS THAN (738064) ENGINE = InnoDB,
PARTITION future VALUES LESS THAN MAXVALUE ENGINE = InnoDB);
I'm using the same query:
SELECT count(id) FROM RawData1
where timedate BETWEEN DATE_FORMAT(date_sub(now(),INTERVAL 2 YEAR),'%Y-%m-01') AND now();
2 problems: 1. why the partitioned table runs longer then the regular table? 2. the regular table returns 36380217 in 17.094 Sec. is it normal, all R&D leaders think it is not fast enough, it need to return in ~2 Sec.
What do I need to check / do / change ? Is it realistic to scan 35732495 rows and retrieve 36380217 in less then 3-4 sec?
Upvotes: 0
Views: 86
Reputation: 142218
PARTITIONing
is not a performance panacea.id
come from?cdasite
? If thousands, not millions, build a table mapping cdasite <=> id and switch from a bulky VARCHAR(45)
to a MEDIUMINT UNSIGNED
(or whatever is appropriate). This item may help the most, but perhaps not enough.status
, but probably using TINYINT UNSIGNED
. Or think about ENUM. Either is 1 byte, not 4.(20)
on INT(20)
means nothing. You get a 4-byte integer with a limit of about 2 billion.timedates
?branch_id
and branch_idString
-- this smells like a pair that needs to be in another table, leaving only the id here?COUNT(*)
is the same as COUNT(id)
since id
is NOT NULL
.To get that query even faster, build and maintain a Summary Table. It would have at least a DATE
in the PRIMARY KEY
and at least COUNT(*)
as a column. Then the query would fetch from that table. More on Summary tables: http://mysql.rjweb.org/doc.php/summarytables
Upvotes: 1