Ye Huang

Reputation: 639

Why does splitting a table in MySQL make inserts and queries slower?

My target is to store about 60 million rows in a MySQL table for high-speed reads, and probably also continuous inserts.

For product-design reasons, these 60 million rows split naturally into 3000 chunks, so I decided on a table sharding strategy and split the one 60M-row table into 3000 tables.
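
Roughly, the sharding idea looks like this at the SQL level (a sketch only; the shard names, the template table and the chunk numbering are assumptions, since the question does not show the real naming scheme):

-- One small table per natural chunk; 'data_chunk_template' is a hypothetical
-- table holding the common column definitions.
CREATE TABLE data_chunk_0001 LIKE data_chunk_template;
CREATE TABLE data_chunk_0002 LIKE data_chunk_template;
-- ... and so on, up to data_chunk_3000

-- Each read then hits exactly one small table, e.g. for chunk 42:
SELECT * FROM data_chunk_0042 WHERE datimeutc >= '2013-01-01';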

I took 3 million rows for the following test:

  1. 3 million rows in 1 table: inserting these 3 million rows takes about 80 seconds, and every 1000 queries (each query fetching 1000 rows from this 3-million-row table) takes about 10 seconds.

  2. 3 million rows split evenly across 3000 tables: inserting the 3 million rows into the 3000 tables takes 79 seconds (not really quicker); every 1000 queries against the 3000 tables (where each table holds 1000 rows) takes 120 seconds (12x slower than above).

WHY is that? Although I have 3000 tables, they are basically just files managed by MySQL, and each query hits only one table with only 1000 rows, so why is it still so slow?

I am running on an 8-core machine with 15 GB RAM and the following configuration:

open_files_limit 300000
table_open_cache 100000

After 2-3 simulation runs, I also checked MySQL's opened-tables counter, shown below, which seems to be fine for my 3000-table setting?

Opened_tables: 9463
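
For reference, those numbers come from standard MySQL statements like the following:

-- Check how many tables have been opened so far and the related cache settings.
SHOW GLOBAL STATUS LIKE 'Opened_tables';
SHOW VARIABLES LIKE 'table_open_cache';
SHOW VARIABLES LIKE 'open_files_limit';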

How can I solve this problem?

----------- Edit and more thoughts -----------

For the moment I am only exploring the table sharding option; maybe the MySQL MERGE engine can help a little in this direction.
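
To give an idea of that direction (just a sketch, with made-up shard table names; the column list matches the table definition further below), a MERGE table would expose identical MyISAM shard tables through one logical table:

-- A MERGE table over identical MyISAM shards; new rows go into the last shard.
CREATE TABLE datatable_all (
  `id` int(11) NOT NULL,
  `type` int(11) NOT NULL DEFAULT 0,
  `description` varchar(255),
  `datimeutc` datetime,
  `datimelocal` datetime,
  `value` double,
  KEY `INDEX_TYPE` (`type`)
) ENGINE=MERGE DEFAULT CHARSET=utf8
  UNION=(datatable_shard1, datatable_shard2, datatable_shard3)
  INSERT_METHOD=LAST;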

On the other hand, maybe partitioning is not a bad idea either... With MySQL partitioning by RANGE, for instance, I could set each range to 10 million rows, so the 60M table becomes a table with 6 partitions... would both queries and inserts be faster then?

----------- Update of Trying Table Partition -----------

As also commented below, instead of table sharding I was thinking that table partitioning could be a good solution too, especially since it keeps the same table name and has minimal impact on the existing code.

I tried to create 6 partitions on this 60-million-row table:

1) At first, I created the table roughly as follows:

CREATE TABLE `datatable` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `type` int(11) NOT NULL DEFAULT 0,
  `description` varchar(255),
  `datimeutc` datetime,
  `datimelocal` datetime,
  `value` double,
  PRIMARY KEY (`id`),
  KEY `INDEX_TYPE` (`type`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1
PARTITION BY RANGE (id) (
    PARTITION p0 VALUES LESS THAN (10000000),
    PARTITION p1 VALUES LESS THAN (20000000),
    PARTITION p2 VALUES LESS THAN (30000000),
    PARTITION p3 VALUES LESS THAN (40000000),
    PARTITION p4 VALUES LESS THAN (50000000),
    PARTITION p5 VALUES LESS THAN MAXVALUE
);

And the results are quite good. It takes about 1 minute to import the 3 million test rows, and 63 minutes in total to import all 60 million.

The search time for each query (which fetches 20000 rows from the 60M partitioned table) is about 90 milliseconds. I don't have comparative figures for the same queries against a single unpartitioned 60-million-row table, but is 90 milliseconds a reasonable value?
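
As a quick sanity check on whether a query really touches only one partition (the WHERE clause below is just a guess at the query shape), EXPLAIN PARTITIONS reports which partitions are read:

-- EXPLAIN PARTITIONS (MySQL 5.1-5.6) should list only the partition containing
-- the requested id range if pruning works.
EXPLAIN PARTITIONS
SELECT * FROM datatable
WHERE id BETWEEN 25000000 AND 25020000;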

2) I then tried partitioning on the field "type", hoping to confine each incoming query to a single partition. Since MySQL requires every unique key to include the partitioning column, I had to drop the primary key on `id`; the table definition looks like the following:

CREATE TABLE `datatable` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `type` int(11) NOT NULL DEFAULT 0,
  `description` varchar(255),
  `datimeutc` datetime,
  `datimelocal` datetime,
  `value` double,
  KEY (`id`),
  KEY `INDEX_TYPE` (`type`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1
PARTITION BY RANGE (type) (
    PARTITION p0 VALUES LESS THAN (500),
    PARTITION p1 VALUES LESS THAN (1000),
    PARTITION p2 VALUES LESS THAN (1500),
    PARTITION p3 VALUES LESS THAN (2000),
    PARTITION p4 VALUES LESS THAN (2500),
    PARTITION p5 VALUES LESS THAN MAXVALUE
);

This time, inserting the 60M rows takes far longer than in the first case. I don't have the final numbers yet, but so far it has already taken 3 hours to insert only 4M rows...

Why is that?

My thinking is: maybe I am inserting the 60M rows in sequence, i.e. with the row id running from 1 to 60000000. So in case 1), I basically open and lock only the first partition for inserting, and once the first 10M rows are in, I open partition two and continue.

In case 2), on the other hand, the inserts have to frequently and randomly touch all 6 partitions (which are split by 'type' instead of 'id'), so the table locking and unlocking takes too much time? Could that be the reason?
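
If that is the cause, one way to test it (only a sketch; 'staging_table' is a hypothetical source table holding the raw rows) would be to load the data ordered by the partitioning column, so that each partition is filled more or less sequentially:

-- Feed the 'type'-partitioned table in 'type' order, so inserts do not jump
-- randomly between the 6 partitions ('staging_table' is an assumed source).
INSERT INTO datatable (type, description, datimeutc, datimelocal, value)
SELECT type, description, datimeutc, datimelocal, value
FROM staging_table
ORDER BY type;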

Upvotes: 1

Views: 2612

Answers (2)

lisnaz

Reputation: 26

Yes, splitting a table in MySQL is generally good practice in the following scenarios:

  1. the table has grown too large and regular table operations have become unbearably slow (performance dropped dramatically)
  2. the proportion of hot data in the table is relatively small
  3. there is a time window on the data (data can be archived or purged in a timely fashion)
  4. to boost concurrency; in this scenario the data is usually distributed across separate physical servers or different storage systems

In your original post, I think you are mostly concerned with the first scenario, so let's discuss that one further.

Why does performance degrade dramatically when a table gets very large, and where is the size boundary? It's all about memory. Unless you have bought FusionIO or some sort of SSD system, there will always be a steep drop once I/O hits the disks. Normal SATA/SAS disk arrays can only perform around 50~200 random IOPS (with the write cache protected by a BBU), which is very slow compared to the 200,000+ random IOPS of DDR memory. When MySQL's variables are set to reasonable values and the table size is not larger than the cache size, performance is quite good; but when the table grows beyond that limit, the degradation starts. So don't over-optimize table structures unless you know how large they will grow and have tested the system's limits thoroughly. Splitting a table too early won't show much advantage, and performance may even get worse because of the side effects of data fragmentation.

Benchmarks are just like games: they can't really represent real-life cases, so we need to agree on the rules of the game. I am curious about your my.cnf settings, especially the buffer variables, because the performance of the first scenario depends largely on memory caches and disk read/write strategies. The relevant variables are listed below (a quick way to inspect and adjust them is sketched after the list):

  • table_definition_cache: this variable indicates how much table metadata (for MyISAM, the .frm files) can be kept in memory. It won't help if one table is opened repeatedly, but it will help if many tables need to be opened (in your case, 3000 tables) and this cache can hold all of their metadata.
  • table_open_cache: this variable indicates how many internal table handlers MySQL can keep in memory; like the one above, it speeds up switching between tables.
  • key_buffer_size: since you are using MyISAM, this variable plays a very important role in performance. It sets the maximum amount of memory MySQL can allocate for MyISAM indexes; a preferred value is about 30% of system memory if you use MyISAM mostly. Why 30%? There are two things to cache: one is the index and the other is the row data. key_buffer_size covers the indexes, while the OS takes care of caching the row data (block I/O buffer cache). Leave 30% for indexes, 50% for row data, and 20% for the remaining buffers such as table_*_cache, thread_cache, connection caches, etc. This variable probably won't slow down either case on its own, but who knows: if it is set too small, both cases will suffer, and the multi-table case may suffer more.
  • key_cache_block_size: this variable sets the size of a key cache block; a poorly chosen size wastes I/O (over-reading at the head/tail) and causes read-around writes (a read before every write). The multi-table scenario may suffer more here because it has more tables (files).
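
A quick way to inspect the current values and adjust one of them at runtime (the number below is only a placeholder; choose it according to the 30%/50%/20% split described above):

-- Inspect the cache-related settings discussed above.
SHOW VARIABLES LIKE 'table_definition_cache';
SHOW VARIABLES LIKE 'table_open_cache';
SHOW VARIABLES LIKE 'key_buffer_size';
SHOW VARIABLES LIKE 'key_cache_block_size';

-- Placeholder value: roughly 30% of 15 GB RAM for the MyISAM key buffer.
SET GLOBAL key_buffer_size = 4294967296;  -- about 4 GB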

I am also curious about how the SQL queries are written and how many threads you are using to read from and write to MySQL. For example, writing sequentially to one table feels like a sequential write and is much faster than random writes; writing sequentially to 3000 tables feels like random writes, and the speed may not be as good. When 3000 tables are created there are 3000 .MYI files and 3000 .MYD files, and they may not be contiguous on disk (random I/O will happen), whereas a single .MYI and a single .MYD will most probably each be contiguous on disk by themselves. This also applies to disk reads. But in your case reads are much slower than writes; I think that may be because writes are buffered while reads are not, if you are selecting rows for the first time. Also, when reading from one table, MySQL can preload the key cache as a whole in one pass and the OS can read ahead the next blocks because they are contiguous; with many tables, MySQL and the OS can't do that as a whole. If you try spawning more client threads to issue the queries, the performance of the two cases may become closer.
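
As a side note on the key cache preloading mentioned above: for MyISAM this can even be done explicitly, which is practical for one large table but hardly for 3000 shard tables:

-- Preload the MyISAM index of one table into the key cache in a single pass.
LOAD INDEX INTO CACHE datatable;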

About your recent update on partitioning: yes, I think you may be right. Partitioning by 'type' pretty much turns a batch insert that is ordered by primary key (and not by 'type') into random I/O, plus the overhead of switching between partition table handlers.

Upvotes: 1

O. Jones

Reputation: 108686

Three thousand shards? That is FAR too many. The mysqld server is having to scramble to access the data files for your multiple shards, so it is slowing down.

Sixty million rows is a large number for a single table, but it is not too many for the server hardware you describe.

In an application like this, the most important reason for partitioning is to make it easy to remove large numbers of outdated rows quickly. If your rows are dated, you can partition, for example, by month; a sketch follows below.
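
For example, here is a sketch of monthly range partitioning on the question's table (note that MySQL then requires the partitioning column to be part of the primary key, so the key becomes (id, datimeutc); the partition boundaries are placeholders):

CREATE TABLE `datatable_by_month` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `type` int(11) NOT NULL DEFAULT 0,
  `description` varchar(255),
  `datimeutc` datetime NOT NULL,
  `datimelocal` datetime,
  `value` double,
  PRIMARY KEY (`id`, `datimeutc`),
  KEY `INDEX_TYPE` (`type`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
PARTITION BY RANGE (TO_DAYS(datimeutc)) (
    PARTITION p201301 VALUES LESS THAN (TO_DAYS('2013-02-01')),
    PARTITION p201302 VALUES LESS THAN (TO_DAYS('2013-03-01')),
    PARTITION pmax    VALUES LESS THAN MAXVALUE
);

-- Removing a month of outdated rows is then a fast metadata operation
-- instead of a huge DELETE:
ALTER TABLE datatable_by_month DROP PARTITION p201301;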

If you must shard this table, try working with four partitions. But don't shard it unless performance needs force you to. If I were you, I'd get the rest of the application working properly first. Then, once everything was working, I would evaluate all of the system's performance problems (bottlenecks) and deal with them in order of severity.

My hunch tells me that it's unlikely this large table is causing serious performance problems.

Upvotes: 1
