maki
maki

Reputation: 621

when does innodb deliver updates to row data in buffer and disk?

I have a question about when does innodb update row data in the buffer and when does the change go to the disk. This question comes from the reading to undo log which says the history data are in the undo log waiting for rollbacks. If the engine needs undo log for rollback, changes of an update query must have changed the row before commit? And then what does the commit do since the data have already been updated.

Upvotes: 0

Views: 116

Answers (1)

Rick James
Rick James

Reputation: 142208

When you INSERT, UPDATE, or DELETE a row:

Quick summary:

  1. Fetch the block containing the row (or the block that should contain the row). 2. Insert/update/delete the row.
  2. Mark the block as "dirty". It will eventually be written to disk.
  3. Put non-unique secondary index changes in the "change buffer"

More details (on those steps):

  1. To find the 16KB block, drill down the PRIMARY KEY's BTree. If the block is not in the buffer_pool (which is allocated in RAM), fetch it from disk. (This may involve bumping some other block out of the buffer_pool.
  2. Copy the previous value (in case of Update/Delete) to the undo log, and prep it for flushing to disk.
  3. A background task flushes dirty pages to disk. If all is going smoothly, 'most' of the buffer_pool contains non-dirty pages, and you 'never' have to wait for a 'free' block in the buffer_pool.
  4. The Change Buffer is sort of a "delayed write" for index updates. It is transparent. That is, subsequent index lookups will automagically look in the change buffer and/or the index's BTree. The data in the CB will eventually be blended with the real index BTree and eventually flushed to disk.

UNIQUE keys: All INSERTs and UPDATEs that change the Unique key's column(s) necessarily check for dup-key rather than going through the change buffer.

AUTO_INCREMENT has some other special actions.

Depending on the values of innodb_flush_log_at_trx_commit and innodb_doublewrite something may be flushed to disk at the end of the transaction. These handle "atomic" transactions and "torn pages".

Replication: Other activity may include writing to and syncing the binlog, and pushing data to other nodes in the cluster.

The design is "optimistic" in that it is optimized for COMMIT at the expense of ROLLBACK. After a Commit, a process runs around purging the copies that were kept in case of a crash and Rollback. A Rollback is more complex in that it must put back the old copies of the rows. (See also "history list".)

Search for some of the keywords I have mentioned; read some other web pages; then come back with a more specific question.

Commit

Let's look at it from a different side. Each row, including non-yet-committed rows being changed/deleted, has a "transaction id". All the rows for a given transaction have the same id. So, even if there is a crash, InnoDB, knows what to cleanup. COMMIT and ROLLBACK need to be 'atomic'; this is aided by having a single write to disk "says it all". The only way for that to be possible is for the transaction id to be the key. Keep in mind, there could be a million rows scattered around the buffer_pool and data files and logs waiting for the commit/rollback.

After the commit/rollback, InnoDB can leisurely run around cleaning up things. For example, until a UPDATE is committed or rolled back, there are two copies of each row being changed. One of the rows needs to be removed -- eventually. Meanwhile, the two rows are on a "history list". Any other transactions search through the history list to see which one row they are allowed to see -- READ UNCOMMITTED = latest row that has not been committed / rolled back; READ COMMITTED = latest row that has been committed / rolled back; etc.

If I understand it correctly, the undo log is an optimization. For example, on a DELETE the "old values" of the rows are copied to the undo log, and the row is actually deleted from the data BTree. The optimization here is that the undo log is serially written, while the BTree may involve a lot more blocks, scattered around the table. Also, the normal processing of data blocks includes caching them in the buffer_pool. For Commit, the records in the undo log are tossed. For Rollback, there is the tedious effort of using the undo log for reconstruction.

Yes, the history list adds work for all other transactions touching your recently changed rows. But it enables transaction-isolation-modes and aids in recovery from crashes.

Upvotes: 1

Related Questions