Reputation: 2451
We are trying to reduce the load on our AWS RDS instance by going through the performance insights.
The top SQL statement listed in the insight is "commit":
In particular, the commit statement spends most of its time in the "init" state:
When looking at the overall load by states, this "init" state indeed constitutes a large chunk of the load:
So, now, we are not too sure what to derive from this information.
sync_binlog=1
on our master. But is this guess correct, or is there more that could be involved?SHOW FULL PROCESSLIST
. However, I'm wondering if there is a way to see what's the querie(s) related to that COMMIT statement. Otherwise, seems we have to guess what transaction is the root cause.Upvotes: 11
Views: 6209
Reputation: 2451
We finally were able to tackle this "commit" load issue.
A few facts about our system:
After investigating further, our findings are the followings:
For us, the most effective way to fix it is to batch writes together within a single transaction. Many of the writes we are doing do not need to be written to the DB immediately, and it may be OK to lose some of them (most of our writes are related to stats or tracking). Thus, our current solution is to store our stats/tracking writes into the cache and flush them to DB once in a while within a single transaction. This reduces the number of commits, effectively reducing the load.
This solution may not work for everyone or may need some tuning. From my understanding, there are some alternative solutions like:
I also answered my original questions below.
Hopefully, this can be helpful to somebody else. I found that there is very little information online related to this.
First, we have autocommit enabled. Does this "COMMIT" statement includes the automatic COMMIT statement from the autocommit configuration, or is it only exclusive to explicit COMMIT statements when our application is performing transactions?
This "COMMIT" statement includes queries performed with autocommit.
Also, in our context, none of the read queries were included in this "COMMIT" statement, most likely because no binlog flush to disk was required after read queries.
I checked more, and it seems this "init" state is related to the database flushing the binlog to disk (https://dev.mysql.com/doc/refman/8.0/en/general-thread-states.html). That could make sense since we have sync_binlog=1 on our master. But is this guess correct, or is there more that could be involved?
For us, only the binlog flush was causing the high load during this init state.
If that's the case, I read about tuning sync_binlog (e.g. sync_binlog=0, or sync_binlog>1), but with greater risk in case of hardware failure (https://aws.amazon.com/blogs/database/best-practices-for-configuring-parameters-for-amazon-rds-for-mysql-part-2-parameters-related-to-replication/). For this part, I am unclear what is a good practice when scaling RDS with one master and some read-replicas: is it a standard practice to touch this setting on the master, or should we focus more on reducing the amount of transactions we are performing from the application?
The standard practice is to keep sync_binlog=1
on the master to avoid recoverability problems after a crash, especially if you have read-replicas configured.
There are other ways to deal with this problem as listed above in my answer.
I was able to see these "COMMIT" statements stuck in init state when running SHOW FULL PROCESSLIST. However, I'm wondering if there is a way to see what's the querie(s) related to that COMMIT statement. Otherwise, seems we have to guess what transaction is the root cause.
It's doable as described in this blog post: https://www.psce.com/en/blog/2015/01/22/tracking-mysql-query-history-in-long-running-transactions/
However, it needs the events_statements_history to be enabled. On RDS this is disabled and don't allow to enable it as far as I know.
Upvotes: 16