Reputation: 7278
I have a program that can be run in parallel, i.e. multiple instances of the process can run at the same time. Let us say it is an .exe file called MyProgram.exe.
Every run of this program creates a unique id in the database called RunId, and every run has a status. The status can be "Running", "Run" or "Error". When the program starts running, the status is set to "Running". Eventually, the status is updated to either "Run" if everything goes fine or "Error" if there was an error.
This works until the program exits unexpectedly, for example when the server suddenly shuts down. In this case, the status of that specific run stays "Running", which is not true, and this causes confusion in the program.
The last time I did some quick research on this issue, somebody recommended a sensible solution, which unfortunately may not work in my case:
The solution was to create a temp file somewhere with the id of this process when the process starts, and to remove this file when the run comes to an end ("Run" or "Error"). This way, the next run of the program can see whether a temp file from an earlier run is still present and then decide what to do.
The problem with this solution arises when we see a temp file and the program is really still running! That means the program has not quit unexpectedly, but is taking a while to complete, so we don't want to terminate it.
To cut a long story short, how do we notify ourselves in a better way when our program exits unexpectedly?
Upvotes: 1
Views: 176
Reputation: 1048
I'd do it like this:
You add a second table where you log the RunID plus a DateTime. You update this table every second/ten seconds/minute (whatever time interval you desire) from your program as some kind of background task. By "update" I mean you either update the existing record or you add a new record, whatever you consider best.
A second process regularly checks this Log table for every RunId whose current status in the main table is "Running". Whenever there isn't a "fresh" DateTime for a particular RunId (for instance, you log every ten seconds and the latest entry in the Log table is from 20+ seconds ago), you update the "Running" state to "Crashed".
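Here is a rough sketch of that idea in Python, using an in-process SQLite database as a stand-in for your real database. The table names Runs and RunLog, the column LastSeen, and the helper functions are all made up for illustration. Timestamps are passed in explicitly so the logic is easy to test; in a real program you would pass time.time() or let the database supply the time.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: Runs is the main table, RunLog is the second table.
    conn.execute(
        "CREATE TABLE Runs (RunId INTEGER PRIMARY KEY, Status TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE RunLog (RunId INTEGER PRIMARY KEY, LastSeen REAL NOT NULL)"
    )


def log_alive(conn, run_id, now):
    # Called by the program's background task at a fixed interval.
    # This variant keeps one row per run and overwrites it each time.
    conn.execute(
        "INSERT OR REPLACE INTO RunLog (RunId, LastSeen) VALUES (?, ?)",
        (run_id, now),
    )


def start_run(conn, now):
    # Register a new run as "Running" and write its first log entry.
    cur = conn.execute("INSERT INTO Runs (Status) VALUES ('Running')")
    run_id = cur.lastrowid
    log_alive(conn, run_id, now)
    return run_id


def mark_crashed(conn, now, max_age):
    # Called by the second (watchdog) process. Any run that is still
    # "Running" but has no log entry newer than now - max_age is
    # flagged as "Crashed". Returns the number of runs flagged.
    cur = conn.execute(
        "UPDATE Runs SET Status = 'Crashed' "
        "WHERE Status = 'Running' AND RunId NOT IN "
        "(SELECT RunId FROM RunLog WHERE LastSeen >= ?)",
        (now - max_age,),
    )
    return cur.rowcount
```

Picking max_age as roughly twice the logging interval (as in the 10 seconds / 20+ seconds example above) leaves room for one late or missed update before a run gets flagged.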
Upvotes: 0
Reputation: 172478
You could use a heartbeat: At least every x seconds, every "Running" process updates the heartbeat column of its row with the current server timestamp.
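A minimal sketch of such a heartbeat column, written in Python against SQLite purely for illustration. The table name Runs, the column name Heartbeat, the constant X_SECONDS and the helper functions are assumptions, not part of your schema. The point it tries to show is that both the heartbeat write and the staleness check take the time from the database itself, so clock differences between machines do not matter.

```python
import sqlite3

X_SECONDS = 10  # hypothetical heartbeat interval

# SQLite expression for "current database time" as Unix seconds.
DB_NOW = "CAST(strftime('%s', 'now') AS INTEGER)"


def create_schema(conn):
    # Hypothetical table: one row per run, with the heartbeat stored inline.
    conn.execute(
        "CREATE TABLE Runs ("
        " RunId INTEGER PRIMARY KEY,"
        " Status TEXT NOT NULL,"
        " Heartbeat INTEGER NOT NULL)"
    )


def start_run(conn):
    # Insert the run as "Running" with an initial heartbeat.
    cur = conn.execute(
        f"INSERT INTO Runs (Status, Heartbeat) VALUES ('Running', {DB_NOW})"
    )
    return cur.lastrowid


def beat(conn, run_id):
    # Called by the running process at least every X_SECONDS.
    conn.execute(
        f"UPDATE Runs SET Heartbeat = {DB_NOW} "
        "WHERE RunId = ? AND Status = 'Running'",
        (run_id,),
    )


def classify(conn, x=X_SECONDS):
    # For every run still marked "Running", report whether its heartbeat
    # is fresh ('running') or stale ('crashed').
    rows = conn.execute(
        "SELECT RunId, "
        f"CASE WHEN Heartbeat >= {DB_NOW} - ? THEN 'running' ELSE 'crashed' END "
        "FROM Runs WHERE Status = 'Running'",
        (x,),
    )
    return dict(rows)
```

On a real database server you would swap the strftime expression for that server's own current-timestamp function.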
Thus, you have the following cases:
heartbeat >= current timestamp - x: Process is running.
heartbeat < current timestamp - x: Process has crashed.
Upvotes: 2