Farzher
Farzher

Reputation: 14583

MYSQL Making sure two cronjobs don't process the same request

Here's my problem:

In my database I have a requests table, which has a status field. A cronjob runs every minute, looking for every request with a "pending" status. WHen it's done processing the request it sets the status to "finished".

Sometimes this cronjob takes over a minute to run, so two of the same cronjob are running. I need to make sure they don't process the same request.

Right now I'm doing this:

$requests = db_all($this->db->query('SELECT * FROM x__mx_request WHERE status="pending"'));
//few lines of code later
$this->db->query('UPDATE x__mx_request SET status="processing" WHERE id IN('.implode(',', $ids).')');

I set the status to processing after I select it, which stops other scripts from processing it too, but Isn't it possible that one script selects all requests, then a 2nd script selects all requests before the 1st script sets them to processing?

I'd like a more secure way of doing this!

EDIT: Thanks for the great answers, I'll have to actually try them out though before I can mark one as correct.

Upvotes: 1

Views: 1117

Answers (3)

Mark Brackett
Mark Brackett

Reputation: 85665

Set the status to a unique value, and then process everything with that unique value. Have a timeout or some sort of fallback if your processing fails half-way through (but you need that with your current situation anyway).

Something like (and I'm just making up PHP, since I don't particularly know the language):

 $guid = new_guid();

 $this->db->query(
    'UPDATE x__mx_request SET status = ? WHERE status = "pending";', 
    $guid
 );
 $requests = db_all(
    $this->db->query('SELECT * FROM x__mx_request WHERE status = ?;', $guid)
 );

Another option is transactions - but I think you'd need SERIALIZABLE, which means you're basically stuck with only 1 job processing anyway. If you want to do that, a lock file for your cron job makes that easy to do without changing code.

Upvotes: 2

imichaelmiers
imichaelmiers

Reputation: 3519

transactions are what you want. So you have a single transaction that reads the data, changes the pending status to some intermediate value such as inprocess, and returns the data.

This prevents two queries from returning the same data.

Roughly you want something like this pseudo sql

begin transaction 

select into tmp table * from blah where status = "pending"

update status = "being processed" where id in tmp 

end transaction 

return select * from tmp table

Upvotes: 0

Gntem
Gntem

Reputation: 7165

you can try to make a lock/unlock file so cronjobs can check to "know" if a job is/isn't already running. also process pid could be used for identifying status of process in case something failed, not responding etc..

Upvotes: 2

Related Questions