Reputation: 32933
I have a users table with a datetime field last_seen_at. Updating this field takes around 120ms, and I'd like it to be a lot quicker as I do it on pretty much every page load on my site. I can't work out why it's so slow: there are around 55,000 records, which shouldn't be problematically large (I'd have thought).
Here's the table info:
mysql> show table status like 'users';
+-------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
| Name  | Engine | Version | Row_format | Rows  | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time | Check_time | Collation       | Checksum | Create_options | Comment |
+-------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
| users | InnoDB |      10 | Compact    | 55609 |            954 |    53051392 |               0 |     43352064 |  26214400 |          67183 | 2015-09-22 13:12:13 | NULL        | NULL       | utf8_general_ci |     NULL |                |         |
+-------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+

mysql> desc users;
+--------------+----------+------+-----+---------+----------------+
| Field        | Type     | Null | Key | Default | Extra          |
+--------------+----------+------+-----+---------+----------------+
| id           | int(11)  | NO   | PRI | NULL    | auto_increment |
| last_seen_at | datetime | YES  | MUL | NULL    |                |
+--------------+----------+------+-----+---------+----------------+

mysql> show indexes from users;
+-------+------------+-----------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name                    | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-----------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| users | 0          | PRIMARY                     | 1            | id           | A         | 57609       | NULL     | NULL   |      | BTREE      |         |               |
| users | 1          | index_users_on_last_seen_at | 1            | last_seen_at | A         | 57609       | NULL     | NULL   | YES  | BTREE      |         |               |
+-------+------------+-----------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
As you can see, I've already got an index on the last_seen_at column. I've omitted all other columns (apart from id) for clarity's sake.
When I update last_seen_at I do it like so:
update users set last_seen_at = '2015-10-05 12:34:45' where id = 1182;
MySQL server info:
Server version: 5.5.44-0ubuntu0.12.04.1 (Ubuntu)
Is there anything I can do to speed up the update?
EDIT - I'd previously said the query was taking 700ms. It's actually more like 120ms; sorry, I was looking at the wrong query. This still feels a bit too long, though. Is this actually a reasonable write time after all?
EDIT - all my timings come from manually entering SQL queries in the mysql shell client. I do use MySQL in my Ruby on Rails web app, but that app is not involved for the purposes of this question: I'm purely looking at the database level.
Upvotes: 3
Views: 25359
Reputation: 4222
Issuing a DB write on every page view is just bad design and scales very badly. It is considered good style not to issue any writes during a GET request - and while you don't necessarily need to be religious about it, it is a very good practice for scaling.
If you absolutely need those timestamps, a simple way to handle them would be to dump them into a key-value store - memcached, redis, whatever - and write them to the DB from time to time.
A super-simple way to increase your throughput would be to write an updated value only if it differs from the previous one by at least an hour (or a day) - that would guarantee each user basically gets one write per browsing session, cutting your writes 10-100 times depending on your site's usage patterns.
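If you have nowhere handy to cache the previous value, the same threshold can be expressed directly in SQL. A minimal sketch (it still costs a round trip, but it skips the physical row write when nothing needs to change):

-- Only touch the row when the stored timestamp is missing or at least an
-- hour old; otherwise the WHERE clause matches nothing and no write happens.
UPDATE users
SET    last_seen_at = NOW()
WHERE  id = 1182
  AND (last_seen_at IS NULL
       OR last_seen_at < NOW() - INTERVAL 1 HOUR);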
Upvotes: 2
Reputation: 142298
If the record is wide and the table is busy, it may be best to move this column (plus id) into a "parallel" table.
When updating a row, another copy of the row is kept until the transaction (and possibly other transactions) completes. This involves copying the entire record, possibly causing a block split. Plus there are issues with the REDO log and UNDO log. And if you are using replication, there is the binlog. A narrow row lessens all of these issues.
120ms sounds very high, so I suspect a lot of other activity is going on in this table; splitting the table may decrease contention.
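A minimal sketch of such a split (the table and column names here are only illustrative):

-- Narrow "parallel" table holding only the hot column.
CREATE TABLE users_last_seen (
  user_id      INT NOT NULL PRIMARY KEY,   -- same value as users.id
  last_seen_at DATETIME NULL
) ENGINE=InnoDB;

-- The per-page-load update then touches only this narrow row.
UPDATE users_last_seen
SET    last_seen_at = '2015-10-05 12:34:45'
WHERE  user_id = 1182;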
Also, is this UPDATE part of a bigger transaction, or is it done outside a transaction with autocommit=1? The latter makes more sense.
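For reference, the two shapes being contrasted look roughly like this (illustrative only):

-- Outside a transaction with autocommit=1: the row lock is held only for
-- the duration of this single statement.
SET autocommit = 1;
UPDATE users SET last_seen_at = NOW() WHERE id = 1182;

-- Inside a bigger transaction: the row lock is held until COMMIT, which can
-- add contention on a busy table.
START TRANSACTION;
UPDATE users SET last_seen_at = NOW() WHERE id = 1182;
-- ... other work in the same transaction ...
COMMIT;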
Upvotes: 1
Reputation: 24959
Write user events (the id plus a now() equivalent) to a log file. Process that log file from another process, such as a Create Event or something written in another programming language entirely (Java, you name it). Let's call that the worker process (wp).
The user keeps operating in the environment where the activity occurs, but does not endure the blocking overhead of the update call slowing down their UX (user experience) - blocking means they wait. Instead, the activity is logged much more quickly, such as with an fwrite (language-specific) to a log file.
The open-for-append log file concept can be deployed in a dedicated directory, with either all user activity in one file or one file per user. In the latter case the wp has an easy task: just grab the last line logged for the single update statement. For instance, if there are 11 lines in the file, there is 1 update call, not 11.
The wp runs in the background - in a cron job, a Create Event, anything - and updates as necessary. With 55k users, this system is relatively small. It can fire once every nnn minutes, every 10 seconds, whatever.
As for a MySQL Create Event stub to contemplate:
CREATE EVENT userUpdateActivity
ON SCHEDULE
EVERY 10 SECOND
DO
(something)
or some other wp strategy.
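If it were done purely in MySQL, a more complete event might look roughly like the following sketch - the staging table, its columns, and the interval are illustrative assumptions, and the log file would first have to be loaded into the staging table (e.g. with LOAD DATA INFILE):

-- Illustrative staging table that the log file would be loaded into.
CREATE TABLE user_activity_log (
  user_id INT      NOT NULL,
  seen_at DATETIME NOT NULL
);

-- The event scheduler must be switched on for events to fire.
SET GLOBAL event_scheduler = ON;

DELIMITER $$
CREATE EVENT userUpdateActivity
ON SCHEDULE EVERY 10 SECOND
DO
BEGIN
  -- Apply only the newest timestamp per user in one batched statement...
  UPDATE users u
  JOIN (
    SELECT user_id, MAX(seen_at) AS seen_at
    FROM user_activity_log
    GROUP BY user_id
  ) latest ON latest.user_id = u.id
  SET u.last_seen_at = latest.seen_at;

  -- ...then clear the staging table for the next batch (a production version
  -- would need to guard against rows arriving between these two statements).
  DELETE FROM user_activity_log;
END$$
DELIMITER ;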
The wp processes and then deletes the open-for-append log file. A locking and periodic (daily?) deletion strategy for the log files can be dreamt up.
The problem with a single log file is that it is more difficult for the wp to clean up - that is, to delete processed entries - at the user level.
The benefit of a single log file is that it is self-contained and no directory searching is required.
See the MySQL Create Event manual page. One would still need to do a LOAD DATA INFILE to get at the data if this were done purely in MySQL.
I would opt for a programming language that is well-suited to such log-file processing - Java, C#, Python, just about anything - rather than a clunky Create Event into a processing table.
The main takeaway here, though, is to make it asynchronous.
Upvotes: 2
Reputation: 4648
Well, you appear to be performing the update in the most efficient manner - i.e. using the primary key on the table, so there is not much that can be done there. Assuming the 120ms to update is purely the time taken by the db server (as opposed to the round trip in the web page), I can only think of a few things that might help:
You have indexed the column being updated - that typically adds a little time to the update as the index has to be maintained. I see that you need to use that column, so you can't get rid of the index; but if you could, you might well see better performance.
Batching updates is sometimes a good way of avoiding the real-time performance hit, but still achieving what you want.
You could have the web-triggered insert go into a holding table with a timestamp field, then (offline) batch-update the real data - a rough sketch follows this list. See https://dba.stackexchange.com/questions/28282/whats-the-most-efficient-way-to-batch-update-queries-in-mysql for an example batch update statement.
DB optimisation may help, but only if the db is not in good shape already - so things like memory allocation, tablespace fragmentation, buffer pools etc.
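Here is a rough sketch of the holding-table idea with made-up table and column names (the linked answer covers the batch update itself in more detail):

-- Illustrative holding table: the per-request write becomes a cheap insert
-- here, with no index maintenance on the wide users table.
CREATE TABLE last_seen_holding (
  user_id INT      NOT NULL,
  seen_at DATETIME NOT NULL
);

-- On each page load:
INSERT INTO last_seen_holding (user_id, seen_at) VALUES (1182, NOW());

-- Offline, on a schedule: fold the newest value per user back into users,
-- then empty the holding table.
UPDATE users u
JOIN (
  SELECT user_id, MAX(seen_at) AS seen_at
  FROM last_seen_holding
  GROUP BY user_id
) h ON h.user_id = u.id
SET u.last_seen_at = h.seen_at;

TRUNCATE TABLE last_seen_holding;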
Good luck!
Upvotes: 7
Reputation: 156978
There is not much you can do about this. You already have an index on your column, and it just takes some time to find the row using the index and update it.
The index might be fragmented, which will slow down your lookup. You can refresh the index statistics with ANALYZE TABLE, or rebuild the table and its indexes with OPTIMIZE TABLE.
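For instance (ANALYZE TABLE only refreshes statistics and is cheap; OPTIMIZE TABLE rebuilds the whole table on InnoDB, so it is best run in a quiet period):

-- Refresh the index statistics used by the optimizer.
ANALYZE TABLE users;

-- Rebuild the table and its indexes to remove fragmentation (heavier operation).
OPTIMIZE TABLE users;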
An option might be to delay the update, or to prevent it from blocking page building by using some asynchronous / background task in the programming environment you are using (aka fire-and-forget).
Upvotes: 2