yaodongen

Reputation: 31

split table performance in mysql

Hi everyone. Here is a problem with my MySQL server.
I have a table with about 40,000,000 rows and 10 columns.
Its size is about 4 GB, and the engine is InnoDB.
It is a master database, and it executes only one kind of SQL statement:

insert into mytable ... on duplicate key update ...

About 99% of these statements take the UPDATE path.

Now the server is becoming slower and slower. I heard that splitting the table might improve performance, so I tried it on my personal computer. I split the table into 10 tables, and then into 100, but both attempts failed; the speed actually got worse. So I wonder: why didn't splitting the table improve performance?

Thanks in advance.

more details:

CREATE TABLE my_table (
    id BIGINT AUTO_INCREMENT,
    user_id BIGINT,
    identifier VARCHAR(64),
    account_id VARCHAR(64),
    top_speed INT UNSIGNED NOT NULL,
    total_chars INT UNSIGNED NOT NULL,
    total_time INT UNSIGNED NOT NULL, 
    keystrokes  INT UNSIGNED NOT NULL,
    avg_speed INT UNSIGNED NOT NULL,
    country_code VARCHAR(16), 
    update_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY(id), UNIQUE KEY(user_id)
);
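The actual statement is not shown in the question; against this schema, the IODKU pattern described above might look something like this (the specific columns, values, and update expressions are guesses for illustration):

```sql
-- Hypothetical example; the real statement was not provided.
INSERT INTO my_table
    (user_id, identifier, account_id, top_speed,
     total_chars, total_time, keystrokes, avg_speed, country_code)
VALUES
    (12345, 'abc', 'acct-1', 120, 5000, 600, 5500, 100, 'US')
ON DUPLICATE KEY UPDATE
    top_speed   = GREATEST(top_speed, VALUES(top_speed)),  -- keep the best speed
    total_chars = total_chars + VALUES(total_chars),       -- accumulate counters
    total_time  = total_time  + VALUES(total_time),
    keystrokes  = keystrokes  + VALUES(keystrokes),
    avg_speed   = VALUES(avg_speed);
```

Because `user_id` has a UNIQUE key, the second and later inserts for the same `user_id` take the UPDATE branch, matching the ~99% figure above.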

PS:
I also tried different computers, one with a Solid State Drive and one with a Hard Disk Drive, but that didn't help either.

Upvotes: 0

Views: 562

Answers (2)

Rick James

Reputation: 142296

Splitting up a table is unlikely to help at all. Ditto for PARTITIONing.

Let's count the disk hits. I will skip non-leaf nodes in the BTrees, since they tend to be cached, and count only leaf nodes in the data and indexes, since they tend not to be cached.

IODKU does:

  1. Read the index block containing the entry for any UNIQUE key. In your case, that is probably user_id. (Please provide a sample SQL statement.) 1 read.
  2. If the user_id entry is found in the index, read the record from the data as indexed by the PK(id) and do the UPDATE, and leave this second block in the buffer_pool for eventual rewrite to disk. 1 read now, 1 write later.
  3. If the record is not found, do INSERT. The index block that needs the new row was already read, so it is ready to have a new entry inserted. Meanwhile, the "last" block in the table (due to id being AUTO_INCREMENT) is probably already cached. Add the new row to it. 0 reads now, 1 write later (UNIQUE). (Rewriting the "last" block is amortized over, say, 100 rows, so I am ignoring it.)
  4. Eventually do the write(s).

Total, assuming essentially all take the UPDATE path: 2 reads and 1 write. Assuming the user_id follows no simple pattern, I will assume that all 3 I/Os are "random".

Let's consider a variation... What if you got rid of id? Do you need id anywhere else? Since you have a UNIQUE key, it could be the PK. That is, replace your two indexes with just PRIMARY KEY(user_id). Now the counts are:

  1. 1 read
  2. If UPDATE, 0 read, 1 write
  3. If INSERT, 0 read, 0 write

Total: 1 read, 1 write. 2/3 as many as before. Better, but still not great.
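That change could be sketched as a single ALTER (a sketch only: it assumes nothing else references id, and since it rebuilds the entire ~4GB table it should be tested on a copy first and run during a maintenance window):

```sql
ALTER TABLE my_table
    DROP PRIMARY KEY,           -- remove PK(id)
    DROP COLUMN id,             -- drop the surrogate key entirely
    DROP KEY user_id,           -- old UNIQUE index is now redundant
    ADD PRIMARY KEY (user_id);  -- user_id becomes the clustered key
```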

Caching

How much RAM do you have?
What is the value of innodb_buffer_pool_size?
SHOW TABLE STATUS -- What are Data_length and Index_length?

I suspect that the buffer_pool is not big enough and possibly could be raised. If you have more than 4GB of RAM, make it about 70% of RAM.
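For example, on a machine with 16GB of RAM (a hypothetical figure), the 70% guideline works out to roughly 11GB:

```sql
-- In my.cnf, under [mysqld]:
--   innodb_buffer_pool_size = 11G
-- Or at runtime (MySQL 5.7+ can resize the buffer pool dynamically):
SET GLOBAL innodb_buffer_pool_size = 11 * 1024 * 1024 * 1024;
```

On versions before 5.7 the setting is not dynamic, so it must go in my.cnf followed by a restart.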

Others

SSDs should have helped significantly, since you appear to be I/O bound. Can you tell whether you are I/O-bound or CPU-bound?

How many rows are you updating at once? How long does it take? Is it batched, or one at a time? There may be a significant improvement possible here.
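If the workload allows it, batching many rows into one IODKU statement saves a round trip per row. A sketch (the column list and update expressions are assumed, since the real statement was not shown):

```sql
-- One statement, one round trip, for many rows at once.
INSERT INTO my_table
    (user_id, top_speed, total_chars, total_time, keystrokes, avg_speed)
VALUES
    (101, 110, 400, 60, 450,  95),
    (102, 130, 700, 90, 760, 105),
    (103,  90, 300, 45, 330,  80)
ON DUPLICATE KEY UPDATE
    top_speed   = VALUES(top_speed),    -- VALUES() refers to the value
    total_chars = VALUES(total_chars),  -- that would have been inserted
    total_time  = VALUES(total_time),
    keystrokes  = VALUES(keystrokes),
    avg_speed   = VALUES(avg_speed);
```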

Do you really need BIGINT (8 bytes)? INT UNSIGNED is only 4 bytes.

Is a transaction involved?

Is the Master having a problem? The Slave? Both? I don't want to fix the Master in such a way that it messes up the Slave.

Upvotes: 1

songlei.wang

Reputation: 381

Try splitting your database across several MySQL instances, using a proxy such as mysql-proxy or HAProxy, instead of running a single MySQL instance. You may see a large performance improvement.

Upvotes: 0
