chinapubmed

Reputation: 11

How to decide varchar partition RANGE in MySQL 5.5?

Background

I have a very large table that looks like this:

CREATE TABLE tb_doc (
did mediumint(8) unsigned NOT NULL auto_increment, 
title varchar(80) NOT NULL default '',
...,
PRIMARY KEY  (did), 
KEY title (title)
) 
ENGINE=MyISAM;

The type of title is varchar(80). Most of the time title is a purely numeric string like '111111', '2222222' or '44444444'; sometimes it is another utf-8 string, like '3a', 'a4' or "中国" (Chinese characters).

I already partition by HASH (did), but my SELECT statements always look like

SELECT did, title,... FROM tb_doc WHERE title= '1111111';
SELECT did, title,... FROM tb_doc WHERE title= '2222222';

So I want to partition by title instead, hoping this will be faster. This leads to my question.

Experiment

I used the following statement:

PARTITION BY RANGE COLUMNS (title)(
PARTITION p00 VALUES LESS THAN (1),         # not pure number strings
PARTITION p01 VALUES LESS THAN (500000),    # pure number strings from 1 to 500k
PARTITION p02 VALUES LESS THAN (1000000),   # pure number strings from 500k to 1000k
PARTITION p03 VALUES LESS THAN (1500000),   # pure number strings from 1000k to 1500k
..........                                  # ......    

PARTITION pn VALUES LESS THAN (25000000),   # the biggest number now
)
;

Similar Questions

I read the following two Q&As, "Partitioning a database table in MySQL" and "How to Partitioning a table using a LIKE criteria in Mysql", but they deal with English-only data and do not work in my situation.

Questions

  1. Partitioning by title is better, right?
  2. Can you give me a "utf-8" RANGE example? I tried '500000', '1000000', ..., but they do not work.
  3. If I use SELECT xxx FROM tb_doc WHERE title='12345', does MySQL fetch data from partition 1 only?
  4. This table is ~50GB. How many partitions are optimal?

Thank you in advance.

Upvotes: 1

Views: 4465

Answers (2)

Rick James

Reputation: 142453

Partitioning by title, even if you could do it, will not speed up

SELECT did, title,... FROM tb_doc WHERE title= '1111111';

For a further discussion of the limitations of PARTITIONing, plus the few use cases it does help, see my blog.
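
One way to check this on your own data is to look at the query plan before changing anything. The statement below is only a sketch against the tb_doc table from the question; the column list is shortened for illustration. With the existing KEY title (title), the plan would be expected to show an index lookup on title, which is already a cheap operation that partition pruning does not make meaningfully cheaper.

```sql
-- Inspect how MySQL resolves the lookup on the existing table.
-- Assumption: tb_doc is defined as in the question, including KEY title (title).
EXPLAIN
SELECT did, title
FROM tb_doc
WHERE title = '1111111';
```

If the key column of the plan names the title index, the query is already using it, and the remaining cost is mostly fetching the matching rows.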

Upvotes: 0

Stoleg

Reputation: 9320

Note that VARCHAR stores characters from multiple languages properly only when the column uses a Unicode character set such as utf8. Declaring the column as NVARCHAR is one way to get that, since MySQL treats it as VARCHAR with the utf8 character set.

HASH partitioning is used to distribute load evenly over partitions. I would first partition by something meaningful to a human (columns that often appear in the WHERE clause) and then add HASH sub-partitioning to use as many cores as possible at the same time. The number of HASH sub-partitions in this case would be <= the number of cores.

I would suggest creating a clustered index on the title column. This will speed up your queries. (In MySQL this requires InnoDB, where the table is clustered on the primary key; MyISAM has no clustered indexes.)

And in relation to your questions:

  1. Not necessarily. Queries will speed up because of the clustered index, not because of partitioning. Use partitioning to manage the table, e.g. to delete many rows quickly. If a good proportion of your queries look for many rows (not just 1), or title is not a UNIQUE column, then you may consider partitioning.

  2. As an example of a UTF-8 partition boundary I would say: LESS THAN ('c')

  3. Depending on how you define partitioning it may hit 1, several or all partitions.

  4. There is no penalty for having many partitions, but a table in MySQL 5.5 can have at most 1024 partitions, counting sub-partitions.
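
To make points 2 and 3 concrete, here is a minimal sketch of RANGE COLUMNS on a string column. The table name tb_doc_by_title and the boundary values are made up for illustration. Two assumptions are worth stating: the boundaries are string literals and are compared as strings under the column's collation, which is why '500000' followed by '1000000' is rejected as not strictly increasing; and the primary key on did is left out, because MySQL requires the partitioning column to be part of every unique key.

```sql
-- Strings compare character by character, so '1000000' sorts before '500000'.
SELECT '1000000' < '500000';   -- returns 1

-- Hypothetical table: same columns as tb_doc, minus the primary key.
CREATE TABLE tb_doc_by_title (
    did   MEDIUMINT(8) UNSIGNED NOT NULL,
    title VARCHAR(80) CHARACTER SET utf8 NOT NULL DEFAULT '',
    KEY title (title)
) ENGINE=MyISAM
PARTITION BY RANGE COLUMNS (title) (
    PARTITION p0 VALUES LESS THAN ('2'),        -- titles starting with '0' or '1'
    PARTITION p1 VALUES LESS THAN ('5'),        -- titles starting with '2' to '4'
    PARTITION p2 VALUES LESS THAN ('a'),        -- remaining digits and punctuation
    PARTITION p3 VALUES LESS THAN ('c'),        -- titles starting with 'a' or 'b'
    PARTITION p4 VALUES LESS THAN (MAXVALUE)    -- everything else, e.g. '中国'
);

-- The partitions column shows which partitions the query would touch.
EXPLAIN PARTITIONS
SELECT did, title FROM tb_doc_by_title WHERE title = '12345';
```

With these boundaries, an equality lookup such as title = '12345' would be expected to touch a single partition, while a query without a condition on title has to scan all of them.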

When you want to partition by a string value, use KEY partitioning as described here: 18.2.5. KEY Partitioning.

Example:

CREATE TABLE tm1 (
    s1 CHAR(32) PRIMARY KEY
)
PARTITION BY KEY(s1)
PARTITIONS 10;

To begin with, set the number of partitions to the number of letters in your alphabet (or in all alphabets you anticipate seeing in the table).
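
Applied to a table shaped like the one in the question, KEY partitioning on title might look like the sketch below. The table name tb_doc_key and the partition count of 16 are arbitrary choices for illustration, and did is indexed with a plain key rather than a primary key, since a primary key would have to include title.

```sql
-- Hypothetical table: KEY partitioning hashes title to pick a partition,
-- so no string boundaries have to be chosen.
CREATE TABLE tb_doc_key (
    did   MEDIUMINT(8) UNSIGNED NOT NULL,
    title VARCHAR(80) CHARACTER SET utf8 NOT NULL DEFAULT '',
    KEY did (did),
    KEY title (title)
) ENGINE=MyISAM
PARTITION BY KEY (title)
PARTITIONS 16;

-- For an equality condition on title, the partitions column of the plan
-- would be expected to name a single partition.
EXPLAIN PARTITIONS
SELECT did, title FROM tb_doc_key WHERE title = '12345';
```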

Upvotes: 1
