Reputation: 19582
I have a simple algorithm, as follows (irrelevant details omitted for clarity):
while ( my $data = get_data() ) {
    process($data);
}
sub get_data() {
    return $dbh->execute("SELECT * FROM TABLE WHERE status = 'submitted'");
}
sub process {
    my $data = shift;
    foreach my $row (@$data) {
        handle($row);
    }
}
sub handle {
    my $row = shift;
    # process logic here
    $dbh->execute("UPDATE TABLE SET status='done'" WHERE id = $row->id");
}
Now the problem is that there is a bug that leads to an infinite loop: if the handling or core processing logic fails and the code never manages to update the status to 'done', then the next iteration fetches the same records and retries them over and over.
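To make the failure mode concrete, this is roughly what happens (the die stands in for any error in the real processing logic; I have used DBI's do with a placeholder here just for the sketch):
sub handle {
    my $row = shift;
    die "processing failed";    # any exception in the real logic ends up here
    # never reached, so the row keeps status 'submitted' and get_data()
    # returns it again on the next pass, hence the endless retries
    $dbh->do("UPDATE TABLE SET status = 'done' WHERE id = ?", undef, $row->id);
}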
Adding a new status would, I think, just move the same problem elsewhere, so I suspect there is some kind of flaw in my logic.
Is there a way for a process to "understand" that it is stuck in a loop and is not making progress? Or, to rephrase as per @amit: is there a way to avoid the infinite loop?
I could only come up with the following:
1) When I do the get_data() I add the rows to an in-memory array, so that on the next loop I would do SELECT * FROM TABLE WHERE status = 'submitted' AND id NOT IN (@array).
This would work, but for a large number of records (e.g. 50000) it seems impractical to place them all in an IN clause.
2) Check how many records were actually updated: if sizeof($data) != the number of updated records, it could mean the process is stuck (see the sketch below).
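For 2), something like the following is what I had in mind. It is only a sketch: it assumes handle() can be changed to return the number of rows its UPDATE actually changed (DBI's execute returns the affected row count for an UPDATE), which my current code does not do.
while ( my $data = get_data() ) {
    my $updated = 0;
    foreach my $row (@$data) {
        # assumes handle() returns the UPDATE's affected row count
        $updated += handle($row);
    }
    # not every fetched row was marked 'done': stop instead of retrying forever
    last if $updated < scalar @$data;
}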
Neither of these approaches seems nice to me. Can anyone please help me figure this out?
Upvotes: 0
Views: 130
Reputation: 639
If you don't want to use an in-memory array (which I don't blame you for), are you able to modify your existing table or create another one? If so, you could move successfully updated records to the new table, even if it's only 2 columns, and then:
sub get_data() {
    return $dbh->execute("SELECT * FROM TABLE WHERE ID NOT IN (SELECT ID FROM NEW_TABLE)");
}
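The recording half of that idea could be as simple as this, untested and assuming NEW_TABLE has at least an ID column:
sub handle {
    my $row = shift;
    # process logic here
    # record the success so that the next get_data() skips this row
    $dbh->do("INSERT INTO NEW_TABLE (ID) VALUES (?)", undef, $row->id);
}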
It's difficult to maintain an in-memory hash/array when dealing with this many records, especially if your application crashes for some reason. At least this way you are making progress.
I don't see a different way in the programming logic to handle this.
Upvotes: 1
Reputation: 126742
The code that you show won't compile because of this line
$dbh->execute("UPDATE TABLE SET status='done'" WHERE id = $row->id");
which has too many double-quote characters to balance.
I think you are confusing yourself with so many subroutines to perform a simple task. I believe this short program does the same thing, and I think it is more readable.
Note that it is best practice to prepare each statement and to use placeholders in the prepared SQL string, so that a subsequent execute can pass the actual data.
I haven't been able to test this code, as it would take a while to set up a sample database; but I have checked that it compiles.
use strict;
use warnings;
use DBI;
my $dbh = DBI->connect('dsn', 'username', 'password');
my $select = $dbh->prepare(q{SELECT * FROM TABLE WHERE status = 'submitted'});
my $set_done = $dbh->prepare(q{UPDATE TABLE SET status = 'done' WHERE id = ?});
$select->execute;
while (my $row = $select->fetchrow_hashref) {
    $set_done->execute($row->{id});
}
Once you get this running, there are a couple of problems that occur to me.
If all you want from TABLE is the id then you shouldn't be using SELECT *.
You could write just UPDATE TABLE SET status = 'done' WHERE status = 'submitted'
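That is, if there is no per-row processing to do, a single statement would replace the fetch/update loop altogether, for example:
$dbh->do(q{UPDATE TABLE SET status = 'done' WHERE status = 'submitted'});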
But maybe your real situation is more complex? Please let us know so that we can help you better.
Upvotes: 0