Reputation: 1
I've been investigating some strange behaviour in a client's AWS RDS instance (MySQL 5.7.26) running two databases. I'm not a DB engineer, so this is going over my head.
Over the past year there have been three occurrences of the instance's storage usage increasing over a 2-hour period. The bumps were in the 1GB to 8GB range. For perspective, the RDS instance has 60GB allocated to it. Here is a screenshot of the CloudWatch metrics for one of these occurrences.
The two databases have the same schema, one is used for tests, the other for the live environment. They both consist of one main table of time series data. Here is the schema:
SHOW CREATE TABLE timeSeriesTable;
CREATE TABLE `timeSeriesTable` (
  `id` varchar(50) NOT NULL,
  `sensor_id` int(11) DEFAULT NULL,
  `timestamp` datetime DEFAULT NULL,
  `server_timestamp` int(11) DEFAULT NULL,
  `modified` int(11) DEFAULT NULL,
  `deleted` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `sensor_id` (`sensor_id`),
  KEY `timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
The live table statistics are as follows:
column_count: 6
table_rows: 88749947
data_length: 5.8GB
index_length: 8.3GB
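(These figures can be pulled straight from information_schema; a minimal sketch, assuming the live database is named live_db - adjust the schema name:)

SELECT table_rows,
       ROUND(data_length  / 1024 / 1024 / 1024, 1) AS data_gb,
       ROUND(index_length / 1024 / 1024 / 1024, 1) AS index_gb
FROM information_schema.TABLES
WHERE table_schema = 'live_db'
  AND table_name   = 'timeSeriesTable';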
There are many things that can be optimized here, but my main concern at the moment is avoiding future storage issues like these.
UPDATE: After a DB reboot, the free storage magically increased by 8GB.
Upvotes: 0
Views: 825
Reputation: 142208
Perhaps the cloud service was taking a backup?
You don't want to talk about optimization, but that could be the solution, or at least part of it.
INT takes 4 bytes; consider using a smaller datatype.
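A hypothetical example of such a change (verify the real value ranges before shrinking anything; this is a sketch, not a prescription for these exact types):

ALTER TABLE timeSeriesTable
    MODIFY deleted   TINYINT UNSIGNED,    -- 1 byte instead of 4; fine for a 0/1 flag
    MODIFY sensor_id SMALLINT UNSIGNED;   -- 2 bytes; only if sensor ids stay below 65536

Note that on an 88M-row table an ALTER like this rewrites the whole table, so plan for it to take a while.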
What is in the id? Anything useful? If not, let's discuss getting rid of it; it is bulky, both in the data's BTree and in both secondary indexes' BTrees.
If (sensor_id, timestamp) is unique, it would probably be better for it to be the PRIMARY KEY. (That's a "composite" index; the order is important.) This might help some SELECTs -- both in performance and in how much they need to load into RAM.
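A sketch of that change, assuming (sensor_id, timestamp) really is unique and nothing else references id:

ALTER TABLE timeSeriesTable
    MODIFY `sensor_id` INT NOT NULL,
    MODIFY `timestamp` DATETIME NOT NULL,
    DROP PRIMARY KEY,
    DROP COLUMN `id`,
    ADD PRIMARY KEY (`sensor_id`, `timestamp`),
    DROP INDEX `sensor_id`;   -- redundant once sensor_id is the PK's leading column

On 88M rows this rebuilds the table, so expect it to take time and to need extra free storage while it runs.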
I suspect a SELECT came along and scanned the entire table, and only part of the table was in the cache (the buffer pool) before that. This could easily explain a spike as the buffer pool grows. The buffer pool is capped at innodb_buffer_pool_size, but I guess it had not yet expanded to its full size before the spike.
More
innodb_buffer_pool_size might be 12GB. A table scan of that sensor data would suddenly expand the buffer pool if it was not already at 12GB. This is normal.
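You can check the configured cap and how many of its pages are actually occupied with read-only status queries (safe to run at any time):

SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_free';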
Let's look at that big SELECT and see if we can make it less scary.
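One way to find a likely culprit, assuming performance_schema is enabled (it is on by default in MySQL 5.7), is to list the statement digests that have examined the most rows -- a full table scan will float to the top:

SELECT DIGEST_TEXT,
       COUNT_STAR,
       SUM_ROWS_EXAMINED
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_ROWS_EXAMINED DESC
LIMIT 10;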
Upvotes: 0
Reputation: 1512
I'm not 100% sure there is anything anyone can do to help with this problem. I have a number of possible suggestions; however, I'm not sure any of them can be checked or verified through RDS alone:
It does look very much like additional data was being written to the database at the time. There is a spike in IOPS, latency and storage space.
How did you identify that there were no spikes in the volume of data written? If you used a query for that, sharing it may help.
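One way to sanity-check write volume from the database side is to sample the server's own cumulative counters twice, some time apart, and compare the differences (a sketch; these are read-only and reset only on restart):

SHOW GLOBAL STATUS LIKE 'Innodb_rows_inserted';
SHOW GLOBAL STATUS LIKE 'Innodb_rows_updated';
SHOW GLOBAL STATUS LIKE 'Com_insert';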
I'd really advise seeking professional support for this. I would definitely not recommend investigating anything on the database instance if you are not trained and confident in doing so.
In my experience, tracing database problems is extremely difficult, and I'm not sure anyone would be able to help you effectively without direct access to the database and its monitoring, which is something you may struggle to find here.
Upvotes: 3