Reputation: 730
This kind of thing has been done a million times, I'm sure, but my search-fu appears weak today, and I'd like opinions on what is generally considered the best way to accomplish this.
My application keeps track of sessions for online users in a system. Each session corresponds to a single record in a database. A session can end in one of two ways: either a "stop" message is received, or the session times out. The former case is easy: it is handled in the message processing thread and everything is fine. The latter case is where the concern comes from.
In order to process timeouts, each record has an ending time column that is updated each time a message is received for that session. To make timeouts work, I have a thread that returns all records from the database whose endtime < NOW() (has an end time in the past), and goes through the processing to close those sessions. The problem here is that it's possible that I might receive a message for a session while the timeout thread is going through processing for the same session. I end up with a race between the timeout thread and message processing thread.
I could use a semaphore or the like and simply prevent the message thread from processing while the timeout pass is running, since it only needs to run every 30 seconds to a minute. However, as the user table gets large, this is going to cause serious performance issues. What I think I would like is a way for the message thread to know that a record is currently being processed by the timeout thread. With that, I could either discard the message or wait for the timeout thread to finish, but only when there is a conflict rather than always.
Currently my application uses JDBC directly. Would there be an easier/standard method for solving this issue if I used a framework such as Hibernate?
Upvotes: 1
Views: 1731
Reputation: 29649
This is a great opportunity for all kinds of crazy bugs to occur, and some of the cures can cause performance issues.
The classic solution would be to use transactions (http://dev.mysql.com/doc/refman/5.0/en/commit.html). This allows you to guarantee the consistency of your data, but a long-running transaction on the database turns it into a huge bottleneck: if your "find timed-out sessions" code runs for a minute, the transaction may run for that entire period, potentially blocking writes to the affected rows or table(s), depending on the storage engine and isolation level. Most systems would not deal well with this.
My favoured solution for this kind of situation is to have a "state machine" for status; I like to implement this as a history table, but that does tend to lead to a rapidly growing database.
You define the states of a session as "initiated", "running", "timed-out - closing", "timed-out - closed", and "stopped by user" (for example).
You implement code which honours the state transition logic in whatever data access logic you've got. The pseudo code for your "clean-up" script might then be:
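A minimal sketch of such a clean-up pass, written in Python with the built-in sqlite3 module as a stand-in for the JDBC/MySQL setup so that it runs without a server. The table layout, the integer `endtime` column, the function name `close_timed_out_sessions`, and the `close_session` callback are all assumptions made for illustration; only the status names come from the list above.

```python
import sqlite3

# Hypothetical schema: endtime is stored as integer seconds for simplicity.
SCHEMA = """
CREATE TABLE sessions (
    session_id INTEGER PRIMARY KEY,
    status     TEXT NOT NULL,
    endtime    INTEGER NOT NULL
);
"""

def close_timed_out_sessions(conn, now, close_session):
    """Claim expired sessions, close each one, and return the ids that were closed."""
    # Step 1: claim every expired session with one short, conditional update.
    # Only rows still in 'running' are moved, so a session that was stopped
    # a moment earlier is left alone.
    with conn:
        conn.execute(
            "UPDATE sessions SET status = 'timed-out - closing' "
            "WHERE status = 'running' AND endtime < ?",
            (now,),
        )

    # Step 2: read back the claimed sessions; the slow work below happens
    # outside any long-running transaction.
    claimed = [row[0] for row in conn.execute(
        "SELECT session_id FROM sessions "
        "WHERE status = 'timed-out - closing' ORDER BY session_id"
    )]

    closed = []
    for session_id in claimed:
        close_session(session_id)  # caller-supplied closing logic (assumed)
        # Step 3: finish the transition, again guarded by the expected status.
        with conn:
            cur = conn.execute(
                "UPDATE sessions SET status = 'timed-out - closed' "
                "WHERE session_id = ? AND status = 'timed-out - closing'",
                (session_id,),
            )
        if cur.rowcount == 1:
            closed.append(session_id)
    return closed
```

The point of the claim step is that it is a single short statement: once a row is in "timed-out - closing", any other code that checks for 'running' before updating will simply not match it.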
All other attempts to modify the current state of the session record must check that the current status is valid for the attempted change.
For instance, the "manual" stop code should be something like this:
update sessions
set status = 'stopped by user'
where session_id = xxxxx   -- the session being stopped
and status = 'running'     -- matches only if the session is still running
If the auto-close routine has kicked off in the time between showing the user interface and running the database code, the where clause won't match any records and the update affects zero rows. The calling code can check the affected-row count and skip the rest of the stop processing.
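To make the affected-row check concrete, here is a small sketch in Python using the built-in sqlite3 module as a stand-in for JDBC (where `executeUpdate` returns the same kind of count). It assumes a hypothetical `sessions` table with `session_id`, `status`, and `endtime` columns; the function names `stop_session` and `touch_session` are illustrative, not part of any existing code.

```python
import sqlite3

def stop_session(conn, session_id):
    """Try to stop a running session; return True only if this call changed the row."""
    with conn:
        cur = conn.execute(
            "UPDATE sessions SET status = 'stopped by user' "
            "WHERE session_id = ? AND status = 'running'",
            (session_id,),
        )
    # rowcount is 0 when another routine already moved the session out of 'running'.
    return cur.rowcount == 1

def touch_session(conn, session_id, new_endtime):
    """Extend a session's end time on an incoming message, but only while it is running."""
    with conn:
        cur = conn.execute(
            "UPDATE sessions SET endtime = ? "
            "WHERE session_id = ? AND status = 'running'",
            (new_endtime, session_id),
        )
    return cur.rowcount == 1
```

In the message processing thread, a `False` result from `touch_session` would mean the session is no longer running (for example, the timeout routine has already claimed it), so the message can be discarded or handled as a conflict without any global lock.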
For this to work, all code that modifies the session status must check its pre-conditions; the most maintainable way is to encode status and allowed transitions into a separate database table.
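One way a transitions table might look, sketched in Python with the built-in sqlite3 module so it can be run as-is. The table name `status_transitions`, its columns, the helper `change_status`, and the particular set of allowed transitions are assumptions chosen for illustration; the real set depends on your application.

```python
import sqlite3

# Hypothetical schema: one row per allowed (from, to) status pair.
SCHEMA = """
CREATE TABLE sessions (
    session_id INTEGER PRIMARY KEY,
    status     TEXT NOT NULL
);
CREATE TABLE status_transitions (
    from_status TEXT NOT NULL,
    to_status   TEXT NOT NULL,
    PRIMARY KEY (from_status, to_status)
);
INSERT INTO status_transitions (from_status, to_status) VALUES
    ('initiated', 'running'),
    ('running', 'stopped by user'),
    ('running', 'timed-out - closing'),
    ('timed-out - closing', 'timed-out - closed');
"""

def change_status(conn, session_id, new_status):
    """Apply a status change only if the transitions table allows it."""
    with conn:
        cur = conn.execute(
            "UPDATE sessions SET status = ? "
            "WHERE session_id = ? AND EXISTS ("
            "  SELECT 1 FROM status_transitions t "
            "  WHERE t.from_status = sessions.status AND t.to_status = ?)",
            (new_status, session_id, new_status),
        )
    # Zero rows means the transition is not allowed from the current status
    # (or the session does not exist).
    return cur.rowcount == 1
```

With this shape, every caller goes through one guarded update, and adding or removing an allowed transition is a data change rather than a code change.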
You could also write triggers to enforce this logic, though I'm normally not a fan of triggers - only do this if you have to.
I don't think this adds significant performance worries, but test and optimize. Most of the extra work on the database comes from the additional "where" conditions in your update statements; assuming you have an index on status, it's unlikely to have a measurable impact.
Upvotes: 2