Reputation: 1889
I have a table with field which is action_time primary key and type is datetime
I try to break it on partitions
ALTER TABLE foo PARTITION BY RANGE (MONTH(action_time))
(
PARTITION p01 VALUES LESS THAN (02) ,
PARTITION p02 VALUES LESS THAN (03) ,
PARTITION p03 VALUES LESS THAN (04) ,
PARTITION p04 VALUES LESS THAN (05) ,
PARTITION p05 VALUES LESS THAN (06) ,
PARTITION p06 VALUES LESS THAN (07) ,
PARTITION p07 VALUES LESS THAN (08) ,
PARTITION p08 VALUES LESS THAN (09) ,
PARTITION p09 VALUES LESS THAN (10) ,
PARTITION p10 VALUES LESS THAN (11) ,
PARTITION p11 VALUES LESS THAN (12) ,
PARTITION p12 VALUES LESS THAN (13) ,
PARTITION pmaxval VALUES LESS THAN MAXVALUE
);
in phpmyadmin I see partitions with rows but when I execute
explain partitions select * from foo where action_time between '2017-01-01 20:34:08' and '2017-01-21 20:34:08';
or
explain partitions select * from foo where action_time > '2017-01-01 20:34:08' && action_time < '2017-01-21 20:34:08'
it hits all partitions (p01,p02,p03,p04,p05,p06,p07,p08,p09,p10,p11,p12,pmaxval)
what I'm doing wrong ?
I also try this way the same result
ALTER TABLE foo
PARTITION BY RANGE( YEAR(action_time) )
SUBPARTITION BY HASH( MONTH(action_time) )
SUBPARTITIONS 12 (
PARTITION p2015 VALUES LESS THAN (2016),
PARTITION p2016 VALUES LESS THAN (2017),
PARTITION p2017 VALUES LESS THAN (2018),
PARTITION p2018 VALUES LESS THAN (2019),
PARTITION p2019 VALUES LESS THAN (2020),
PARTITION p2020 VALUES LESS THAN (2021),
PARTITION p2021 VALUES LESS THAN (2022),
PARTITION p2022 VALUES LESS THAN (2023),
PARTITION p2023 VALUES LESS THAN (2024),
PARTITION p2024 VALUES LESS THAN (2025),
PARTITION p2025 VALUES LESS THAN (2026),
PARTITION p2026 VALUES LESS THAN (2027),
PARTITION p2027 VALUES LESS THAN (2028),
PARTITION p2028 VALUES LESS THAN (2029),
PARTITION p2029 VALUES LESS THAN (2030),
PARTITION pmax VALUES LESS THAN MAXVALUE
);
I need to break the table by year and month for improve select time, when I'm selecting between dates it sholdn't search in whole table it should search in the relevant partitions. how can I do this?
Upvotes: 4
Views: 3052
Reputation: 9255
MONTH() is not supported for partition pruning. Currently, only four functions are supported by MySQL 5.7/8.0.
In MySQL 8.0, partition pruning is supported for the TO_DAYS(), TO_SECONDS(), YEAR(), and UNIX_TIMESTAMP() functions. See Chapter 5, Partition Pruning, for more information.
You have to use TO_DAYS() instead. e.g.
ALTER TABLE foo PARTITION BY RANGE (TO_DAYS(action_time))
(
PARTITION p01 VALUES LESS THAN (TO_DAYS('2017-02-01')) ,
PARTITION p02 VALUES LESS THAN (TO_DAYS('2017-03-01')) ,
PARTITION pmaxval VALUES LESS THAN MAXVALUE
);
Upvotes: 1
Reputation: 142208
You have found yet another reason why PARTITIONing
is virtually useless.
Supposed you had specified BETWEEN '2015-11-05' AND '2017-02-02'
. Which partitions would it need to hit? All of them.
Supposed you had specified BETWEEN '2015-11-05' AND '2016-02-02'
. Which partitions would it need to hit? 4, but it is not smart enough to wrap around. So it will (I think) hit all.
There are a limited number of patterns (MONTH()
is not one of them) where partitioning will "get it right".
To make BY RANGE( some date )
work, you are limited to BY RANGE(TO_DAYS(date))
(and a few others). But then you have to create a new partition every month (or however often). And, optionally, DROP
the oldest partition.
Now for another reason why you plan is probably useless. What benefit to you expect to get from partitioning? Perhaps performance? Probably won't give you any performance benefit. Let's see your queries so I can explain why.
A simple
SELECT ...
WHERE date >= '...'
AND date < '...' + INTERVAL 20 DAY
will work just as fast with INDEX(date)
as with partitioning. Possibly even faster.
If there is something else in the WHERE
, then that changes everything.
Why PARTITIONing does not speed up simple queries
Let's say you have a simple SELECT
that has a very good index, such as you specify the exact value for the PRIMARY KEY
. (This is called a "point query".)
Case 1: Non-partitioned table. Indexes use a BTree structure. Locating a specific record in a million rows requires drilling down the BTree, which will be about 3 levels deep. For a billion rows, it might be 5 levels.
Case 2: Partitioned table. Partitioning splits the table into multiple tables, each of which have indexes. Locating a specific row will first have to locate the particular partition (sub-table), then drill down the shallower BTree for that partition.
Think if it as (perhaps) removing one level from the BTree, but adding the extra effort of reaching for the partition. The performance difference is minuscule. And it is not clear whether you gain or lose. (Caching, data structures, etc, make this analysis complex.)
Conclusion: For Point Queries, Partitioning never helps, assuming you have a suitable index on the non-partitioned equivalent.
Your particular query is a simple "range" query: WHERE action_time BETWEEN ... AND ...
The optimal table structure (including partitioning and indexing) is
INDEX(action_time)
Another note: If multiple partitions are involved, the SELECT
will fetch rows (if any) from each partition (after pruning), put them together, and then might have to sort the results (depending on other clauses in the SELECT
). Alas there is no parallelism in the execution of the query, so the partitioned variant is more involved, hence, probably slower.
Upvotes: 2