moeseth

Reputation: 1945

Why is AWS RDS MYSQL INSERT taking READ IOPS?

I have a db.r3.2xlarge with 4000 PIOPS. I'm inserting about 1 billion rows from EC2 instances. There is about 40 GB of free RAM right now.

Currently, out of the 4000 PIOPS, READ IOPS is taking about 3000 and I'm only getting about 1000 WRITE IOPS, so writes are going slowly.

How do I check what is consuming the READ IOPS, and how can I speed things up?

Thank you.

Edit:

insert ignore into dna (hash, time, song_id) values (b%s, b%s, %s)

I'm calling self.cursor.executemany(query, rows) from Python.

hash + time + song_id is a composite primary key.

I'm using AWS RDS InnoDB.

I have 4000 PIOPS, but total IOPS is now stuck at about 2000. Write throughput is 60 MB/s.
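Roughly, the insert code looks like this (a simplified sketch: the PyMySQL driver, connection details, and batching are placeholders, not my real setup; the query string is the one above):

# Sketch of the bulk-insert path. PyMySQL and the connection details
# are placeholders; the query string is the one used in production.
import pymysql

conn = pymysql.connect(host="my-rds-endpoint", user="user",
                       password="secret", database="mydb")
cursor = conn.cursor()

query = "insert ignore into dna (hash, time, song_id) values (b%s, b%s, %s)"

def insert_batch(rows):
    # rows is a list of (hash, time, song_id) tuples
    cursor.executemany(query, rows)
    conn.commit()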

Upvotes: 4

Views: 854

Answers (1)

Michael - sqlbot

Reputation: 179084

If the hash is your primary key or is indexed, you're not inserting in primary key and/or index order.

Also, you're using INSERT IGNORE, which suggests you are trying to avoid the inevitable duplicate key error because there's duplicate data among what you're inserting.

For both of these reasons, InnoDB has to do a lot of reading to load the appropriate pages from the tablespaces on disk into memory, in order to find the spot(s) in the primary and/or any secondary indexes where the next row needs to go. That effort may turn out to be wasted if the row is a duplicate, and it may require a page split so that space is available to randomly insert the next hash into its proper place.

If hash is the primary key, it would probably be to your advantage to drop all other indexes while inserting, then add them at the end, where they can be built more efficiently.
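For example (a sketch only: the index name idx_song is hypothetical, since you haven't listed your secondary indexes, and cursor is the same DB-API cursor you're already using):

def drop_secondary_indexes(cursor):
    # Drop a (hypothetical) secondary index before the bulk load.
    # "idx_song" is an assumed name -- adjust to your actual indexes.
    cursor.execute("ALTER TABLE dna DROP INDEX idx_song")

def rebuild_secondary_indexes(cursor):
    # Re-create the index after the load, so InnoDB can build it in one
    # pass instead of maintaining it row by row during the inserts.
    cursor.execute("ALTER TABLE dna ADD INDEX idx_song (song_id)")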

Pre-sorting the inserts by hash should help somewhat, if the batches are large enough and hash is indeed the primary key.
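Something along these lines (a sketch; rows is assumed to be a list of (hash, time, song_id) tuples, matching the column order of your composite primary key):

def insert_sorted_batch(cursor, query, rows):
    # Sort each batch on (hash, time, song_id) before executemany so the
    # rows arrive in roughly primary-key order; default tuple comparison
    # matches the composite key's column order.
    cursor.executemany(query, sorted(rows))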

Upvotes: 2
