How do I prevent two (or more) threads that work on a table at the same time from working on the same row?

Question

I am building a C# WinForms application that fetches data from URLs stored in a table named "Links". Each link has a "Last Checked" and a "Next Check" datetime, plus an "interval" that determines the next check based on the last check.

Right now, I fetch the ID with a query BEFORE doing the web scraping, then set Last Checked to DateTime.Now and Next Check to null until everything is completed. Both are then updated after the web scraping is done.

The problem is that if an ongoing process is aborted, lastcheck will hold a date but nextcheck will remain null.

So I need a better way to keep two processes from working on the same row of the same table, but I am not sure how.

jurez · Accepted Answer

For a multithreaded solution, the standard engineering approach is to use a pool of workers and a pool of work.

This is just a conceptual sketch - you should adapt it to your circumstances:

A worker (i.e. a thread) looks at the pool of work. If there is some work available, it marks it as in_progress. This has to be done so that no two threads can take the same work. For example, you could use a lock in C# to do the query in a database, and to mark a row before returning it.
You need a way of un-marking it after the thread finishes. Successful or not, in_progress must be reset. Typically, you would use a finally block so that the reset is not skipped if an exception is thrown.
If there is no work available, the thread goes to sleep.
Whenever new work arrives (i.e. an INSERT, or a nextcheck comes due), one of the sleeping threads is awakened.
When your program starts, it should clear any in_progress flags in the event of a previous crash.
You should take advantage of DBMS transactions so that any changes a worker makes after completing its work are atomic - i.e. other threads perceive them as if they had happened all at once.
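
The claim-and-release steps above can be sketched in C# roughly as follows. This is a minimal in-memory illustration under stated assumptions, not a definitive implementation: Link, LinkTable, TryClaimDueLink, Release, ResetAllInProgress and Worker.ProcessOne are hypothetical names, and a List<Link> stands in for the real "Links" table. In a real application each locked section would run the corresponding query or update against the database instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for one record of the "Links" table.
public class Link
{
    public int Id;
    public DateTime? LastChecked;
    public DateTime NextCheck;
    public TimeSpan Interval;
    public bool InProgress;
}

// Hypothetical in-memory stand-in for the table; a real version would query the DBMS.
public class LinkTable
{
    private readonly object _gate = new object();
    private readonly List<Link> _rows;

    public LinkTable(IEnumerable<Link> rows) { _rows = rows.ToList(); }

    // Call once at startup: clears flags left behind by a previous crash.
    public void ResetAllInProgress()
    {
        lock (_gate)
        {
            foreach (var row in _rows) row.InProgress = false;
        }
    }

    // Finds a due row and marks it in one locked step, so no two threads get the same row.
    public Link TryClaimDueLink(DateTime now)
    {
        lock (_gate)
        {
            var row = _rows.FirstOrDefault(r => !r.InProgress && r.NextCheck <= now);
            if (row != null) row.InProgress = true;
            return row; // null means no work is available
        }
    }

    // Un-marks the row; on success both dates are updated together.
    public void Release(Link row, bool succeeded, DateTime now)
    {
        lock (_gate)
        {
            if (succeeded)
            {
                row.LastChecked = now;
                row.NextCheck = now + row.Interval;
            }
            row.InProgress = false;
        }
    }
}

public static class Worker
{
    // Processes at most one due link; returns false if there was nothing to do.
    public static bool ProcessOne(LinkTable table, Action<Link> scrape)
    {
        var link = table.TryClaimDueLink(DateTime.Now);
        if (link == null) return false;

        bool succeeded = false;
        try
        {
            scrape(link);       // the web scraping step; may throw
            succeeded = true;
        }
        finally
        {
            // Runs on success and on exception, so the flag is never left set.
            table.Release(link, succeeded, DateTime.Now);
        }
        return true;
    }
}
```

Because the dates are written only in Release, an aborted scrape leaves both Last Checked and Next Check unchanged in this sketch, so the row is simply picked up again later. Note that a C# lock only coordinates threads inside one process; if several separate processes share the table, the claim would have to be made atomic in the database itself.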

By changing the size of worker pool, you can set the maximum number of simultaneously active workers.
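
As one possible illustration of a fixed-size worker pool, the sketch below uses BlockingCollection<T> from System.Collections.Concurrent as the pool of work. LinkWorkerPool, Enqueue, Shutdown and the processLink callback are hypothetical names chosen for this example, and the queue holds link IDs rather than database rows; how IDs get enqueued when a nextcheck comes due is left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical fixed-size pool: workerCount threads share one queue of link IDs.
public sealed class LinkWorkerPool
{
    private readonly BlockingCollection<int> _queue = new BlockingCollection<int>();
    private readonly List<Thread> _threads = new List<Thread>();

    public LinkWorkerPool(int workerCount, Action<int> processLink)
    {
        for (int i = 0; i < workerCount; i++)
        {
            var thread = new Thread(() =>
            {
                // Blocks while the queue is empty; each ID is handed to exactly one thread.
                foreach (int linkId in _queue.GetConsumingEnumerable())
                {
                    try { processLink(linkId); }
                    catch (Exception ex)
                    {
                        // One failed link must not kill the worker thread.
                        Console.Error.WriteLine($"Link {linkId} failed: {ex.Message}");
                    }
                }
            });
            thread.IsBackground = true;
            thread.Start();
            _threads.Add(thread);
        }
    }

    // Adding an ID releases one waiting worker, if any is idle.
    public void Enqueue(int linkId) => _queue.Add(linkId);

    // Stops accepting work and waits for the workers to drain the queue.
    public void Shutdown()
    {
        _queue.CompleteAdding();
        foreach (var thread in _threads) thread.Join();
    }
}
```

With this shape, constructing the pool with a workerCount of 4 would allow at most four links to be checked at the same time, and changing that one argument changes the limit.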
