Reputation: 2126
I want to partition a very large table. As the business is growing, partitioning by date isn't really that good because each year the partitions get bigger and bigger. What I'd really like is a partition for every 10 million records.
The MySQL manual shows this simple example:
CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT NOT NULL,
    store_id INT NOT NULL
)
PARTITION BY RANGE (store_id) (
    PARTITION p0 VALUES LESS THAN (6),
    PARTITION p1 VALUES LESS THAN (11),
    PARTITION p2 VALUES LESS THAN (16),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);
But this means that every row with store_id of 16 or higher gets thrown into the last partition. Is there a way to auto-generate a new partition at each interval (in my case, every 10 million records) so I won't have to keep modifying an active database? I am running MySQL 5.5.
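In other words, I'd like the effect of something like the following, but without adding the partitions by hand as the table grows. (This is only a sketch; the boundaries are illustrative, and MySQL 5.5 also requires every unique key to include the partitioning column, so the keys would need adjusting first.)

```sql
-- Sketch: 10M-row range partitions on the AUTO_INCREMENT column.
ALTER TABLE my_table
PARTITION BY RANGE (row_id) (
    PARTITION p0   VALUES LESS THAN (10000000),
    PARTITION p1   VALUES LESS THAN (20000000),
    PARTITION p2   VALUES LESS THAN (30000000),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);
```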
Thanks!
EDIT: Here is my actual table
CREATE TABLE `my_table` (
    `row_id` int(11) NOT NULL AUTO_INCREMENT,
    `filename` varchar(50) DEFAULT NULL,
    `timestamp` datetime DEFAULT NULL,
    `unit_num` int(3) DEFAULT NULL,
    `string` int(3) DEFAULT NULL,
    `voltage` float(6,4) DEFAULT NULL,
    `impedance` float(6,4) DEFAULT NULL,
    `amb` float(6,2) DEFAULT NULL,
    `ripple_v` float(8,6) DEFAULT NULL,
    PRIMARY KEY (`row_id`),
    UNIQUE KEY `timestamp` (`timestamp`,`filename`,`string`,`unit_num`),
    KEY `index1` (`filename`),
    KEY `index2` (`timestamp`),
    KEY `index3` (`timestamp`,`filename`,`string`),
    KEY `index4` (`filename`,`unit_num`)
) ENGINE=MyISAM AUTO_INCREMENT=690892041 DEFAULT CHARSET=latin1
and an example query for the graph is...
SELECT DATE_FORMAT(timestamp,'%Y/%m/%d %H:%i:%s') as mytime, voltage, impedance, amb, ripple_v, unit_num
FROM my_table
WHERE timestamp >= DATE_SUB('2015-07-31 00:05:59', INTERVAL 90 DAY)
  AND filename = 'dlrphx10s320upsab3' AND unit_num = '5' AND string = '2'
ORDER BY timestamp ASC;
Here is the explain for the query...
mysql> explain SELECT DATE_FORMAT(timestamp,'%Y/%m/%d %H:%i:%s') as mytime,voltage,impedance,amb,ripple_v,unit_num FROM my_table WHERE timestamp >= DATE_SUB('2015-07-31 00:05:59', INTERVAL 90 DAY) AND filename = 'dlrphx10s320upsab3' and unit_num='5' and string='2' ORDER BY timestamp asc;
+----+-------------+------------+------+-------------------------+--------+---------+-------------+-------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+-------------------------+--------+---------+-------------+-------+----------------------------------------------------+
| 1 | SIMPLE | unit_tarma | ref | timestamp,index3,index4 | index4 | 58 | const,const | 13440 | Using index condition; Using where; Using filesort |
+----+-------------+------------+------+-------------------------+--------+---------+-------------+-------+----------------------------------------------------+
Upvotes: 5
Views: 3752
Reputation: 142538
(This answer is directed at the schema and SELECT.)
Since you anticipate millions of rows, first I want to point out some improvements to the schema.
FLOAT(m,n) is usually the 'wrong' thing to do because it leads to two roundings. Either use plain FLOAT (which seems 'right' for metrics like voltage) or use DECIMAL(m,n). FLOAT is 4 bytes; in the cases given, DECIMAL would be 3 or 4 bytes.
When you have both INDEX(a) and INDEX(a,b), the former is unnecessary since the latter covers it. You have 3 unnecessary KEYs. This slows down INSERTs.
INT(3) -- Are you saying a "3-digit number"? If so, consider TINYINT UNSIGNED (values 0..255) for 1 byte instead of INT for 4 bytes. This will save many MB of disk space, hence speed. (See also SMALLINT, etc., and SIGNED or UNSIGNED.)
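For the table shown, those two type changes might look like the following. (A sketch only -- verify that the actual values fit in 0..255 before altering, and note that ALTER TABLE rebuilds a table this size.)

```sql
-- Sketch: shrink 4-byte INT(3) columns to 1-byte TINYINT UNSIGNED.
ALTER TABLE my_table
    MODIFY unit_num TINYINT UNSIGNED DEFAULT NULL,
    MODIFY string   TINYINT UNSIGNED DEFAULT NULL;
```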
If filename is repeated a lot, you may want to "normalize" it. This would save many MB.
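A minimal sketch of such normalization follows; the `filenames` table and `filename_id` column are hypothetical names, not anything in your schema:

```sql
-- Sketch: move repeated filename strings into a lookup table.
CREATE TABLE filenames (
    filename_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    filename    VARCHAR(50) NOT NULL,
    PRIMARY KEY (filename_id),
    UNIQUE KEY (filename)
);
-- my_table would then store the 2-byte filename_id instead of
-- the up-to-50-byte VARCHAR on every row.
```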
Use NOT NULL unless you need NULL for something.
AUTO_INCREMENT=690892041 implies that you are about 1/3 of the way to disaster with row_id, which will top out at about 2 billion. Do you use row_id for anything? Getting rid of the column would avoid the issue; then change the UNIQUE KEY to the PRIMARY KEY. (If you do need row_id, let's talk further.)
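A sketch of that change, assuming nothing references row_id (the PRIMARY KEY columns would also need to be made NOT NULL first, per the earlier advice):

```sql
-- Sketch: drop the nearly-exhausted AUTO_INCREMENT column and
-- promote the existing UNIQUE key to PRIMARY KEY.
ALTER TABLE my_table
    DROP COLUMN row_id,
    DROP INDEX `timestamp`,
    ADD PRIMARY KEY (timestamp, filename, string, unit_num);
```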
ENGINE=MyISAM -- Switching to InnoDB has some ramifications, both favorable and unfavorable. The table would become 2-3 times as big. The 'right' choice of PRIMARY KEY would further speed up this SELECT significantly. (And it may or may not slow down other SELECTs.)
A note on the SELECT: Since string and unit_num are constants in the query, the last two fields of ORDER BY timestamp asc, string asc, unit_num asc are unnecessary. If they are relevant for reasons not apparent in the SELECT, then my advice may be incomplete.
This
WHERE filename = 'foobar'
  AND unit_num = '40'
  AND string = '2'
  AND timestamp >= ...
is optimally handled by INDEX(filename, unit_num, string, timestamp). The order of the columns is not important except that timestamp needs to be last. Rearranging the current UNIQUE key gives you the optimal index. (Meanwhile, none of the existing indexes is very good for this SELECT.) Making it the PRIMARY KEY and the table InnoDB would make it even faster.
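If you keep MyISAM for now, just adding the composite index might look like this (a sketch; the index name is illustrative):

```sql
-- Sketch: composite index matching the WHERE clause; the range
-- column (timestamp) must come last.
ALTER TABLE my_table
    ADD INDEX file_unit_str_ts (filename, unit_num, string, timestamp);
```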
Partitioning? No advantage. Not for performance; not for anything else you have mentioned. A common use for partitioning is for purging 'old'. If you intend to do such, let's talk further.
In huge tables it is best to look at all the important SELECTs simultaneously so that we don't speed up one while demolishing the speed of others. It may even turn out that partitioning helps in this kind of tradeoff.
Upvotes: 4
Reputation: 142538
First, I must ask what benefit Partitioning gives you? Is there some query that runs faster because of it?
There is no auto-partitioning.
Instead, you should have a job that runs every day and counts the number of rows in the 'last active' partition to see if it is near 10M. If so, add another partition.
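Such a job could start from something like this (a sketch; 'mydb' is a placeholder, and TABLE_ROWS is only an estimate for some engines -- use SELECT COUNT(*) on the partition's key range for an exact count):

```sql
-- Sketch: inspect per-partition row counts to decide when to split.
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = 'mydb'
  AND TABLE_NAME   = 'my_table'
ORDER BY PARTITION_ORDINAL_POSITION;
```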
I recommend keeping the "last" partition (the one with MAXVALUE) empty. That way you can REORGANIZE PARTITION to split it into two empty partitions with essentially zero overhead. And I recommend that instead of ADD PARTITION because you might slip up and put something in the last partition.
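The split itself looks something like this (a sketch; partition names and the boundary value are illustrative, and the operation is only near-free while pmax is empty):

```sql
-- Sketch: split the empty MAXVALUE partition into a new bounded
-- partition plus a fresh, still-empty MAXVALUE partition.
ALTER TABLE my_table
REORGANIZE PARTITION pmax INTO (
    PARTITION p69  VALUES LESS THAN (690000000),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);
```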
It is unclear what will trigger 10M. Are there multiple rows for each store_id? And are there new rows coming in for each store? If so, partitioning on store_id won't help, since all partitions will be growing all the time.
OK, so store_id was just a lame example from the reference manual. Please provide SHOW CREATE TABLE so we can talk concretely, not hand-wave. There are simply too many ways to take this task.
What is the activity?
If you mostly hit the "recent" partition(s), then an uneven distribution may be warranted -- periodically add a new partition and combine an adjacent pair of old partitions. (I did this successfully in one system.)
If you will be purging "old" data, then clearly you need to use PARTITION BY RANGE(TO_DAYS(...)) and use DROP PARTITION plus REORGANIZE PARTITION.
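The purge side of that scheme is a single cheap statement (a sketch; the partition name is illustrative):

```sql
-- Sketch: dropping an old date-range partition deletes its rows
-- almost instantly, unlike a large DELETE.
ALTER TABLE my_table DROP PARTITION p2015_01;
```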
And there are lots of other scenarios. But I know of only 4 scenarios where Partitioning provides any performance benefit. See my blog.
Upvotes: 0