mowwwalker

Reputation: 17344

Variation on thread locking

I'm writing a web-scraping program which runs multiple requests at once. Some requests need to happen consecutively, though, so my first thought was to put a lock around the two requests which need to happen together and the same lock around each of the other requests. The problem with this approach is that the other requests would then block each other as well, even though they could safely run in parallel.

For example, there are four pages I need data from on the website: page1, page2, page3, and page4.

When the program starts, I kick off three threads: one for page1, one for page2 and page3, and one for page4. The request for page3 MUST happen DIRECTLY after the request for page2. The requests for page1 and page4 can happen simultaneously.

If I don't use locking, page1 or page4 may be requested in between the requests for page2 and page3 and cause problems. If I use the same lock for the three threads, then the request to page1 may block the request to page4.

What can I do to prevent a request from happening between page2 and page3, but allow other requests to happen simultaneously?

Upvotes: 1

Views: 69

Answers (2)

Michael Burr

Reputation: 340218

You might try using a semaphore to control access to the 'page request' pseudo-resource. Requests that can run concurrently each acquire a single slot; requests that must run exclusively acquire all of the slots, so nothing else can run until they release them.

Something like the following:

// requires: using System.Threading;
private static Semaphore _pool;
private static readonly int kMaxConcurrentPageRequesters = 4;  // or whatever number
private static readonly object _exclusiveGate = new object();

// at some appropriate initialization point

_pool = new Semaphore(kMaxConcurrentPageRequesters, kMaxConcurrentPageRequesters);


// when a normal request is being made that can run concurrently:

_pool.WaitOne();
try {
    perform_page_request();
} finally {
    _pool.Release();  // give the slot back even if the request throws
}


// when an exclusive page request is being made:

// maybe create a Semaphore wrapper that stores the max semaphore count
//   so that you can expose a `WaitAll()` method to replace this loop
lock (_exclusiveGate) {  // one exclusive requester at a time; two collecting slots at once could deadlock
    for (int i = 0; i < kMaxConcurrentPageRequesters; ++i) {
        _pool.WaitOne();
    }
    try {
        perform_exclusive_page_requests();
    } finally {
        _pool.Release(kMaxConcurrentPageRequesters);
    }
}
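
The wrapper suggested in the code comment could look roughly like the sketch below. The names `PageRequestGate`, `RunShared`, and `RunExclusive` are made up for illustration, and it swaps `Semaphore` for `SemaphoreSlim`, which should be sufficient as long as all requests run inside one process. It also serializes exclusive callers with a plain lock, on the assumption that two exclusive callers collecting slots at the same time could otherwise each hold part of the pool and wait forever.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper: remembers the maximum count so an exclusive
// caller can take every slot through a single method call.
public sealed class PageRequestGate : IDisposable
{
    private readonly SemaphoreSlim _slots;
    private readonly object _exclusiveEntry = new object();
    private readonly int _maxCount;

    public PageRequestGate(int maxCount)
    {
        if (maxCount < 1) throw new ArgumentOutOfRangeException(nameof(maxCount));
        _maxCount = maxCount;
        _slots = new SemaphoreSlim(maxCount, maxCount);
    }

    // Runs an action that is allowed to overlap with other shared actions.
    public void RunShared(Action action)
    {
        _slots.Wait();
        try { action(); }
        finally { _slots.Release(); }
    }

    // Runs an action while no other shared or exclusive action is running.
    public void RunExclusive(Action action)
    {
        // Only one exclusive caller may collect slots at a time.
        lock (_exclusiveEntry)
        {
            int acquired = 0;
            try
            {
                for (; acquired < _maxCount; acquired++) _slots.Wait();
                action();
            }
            finally
            {
                // Return only the slots that were actually taken.
                if (acquired > 0) _slots.Release(acquired);
            }
        }
    }

    public void Dispose()
    {
        _slots.Dispose();
    }
}
```

With this in place, the page1 and page4 threads would each call `gate.RunShared(...)`, while the page2/page3 thread would make both requests inside a single `gate.RunExclusive(...)` call. The action passed to `RunExclusive` must not call back into the same gate, since every slot is already held at that point.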

Upvotes: 1

Sinatr

Reputation: 21999

You could delay the work the threads do, or make them wait for a signal. For example,

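// requires: using System.Threading;
var waitA = false;
var thread1 = new Thread(() => { while (!Volatile.Read(ref waitA)) Thread.Sleep(0); /* do work here */ });
var thread2 = new Thread(() => { while (!Volatile.Read(ref waitA)) Thread.Sleep(0); /* do work here */ });
thread1.Start();  // Start() returns void, so it cannot be chained into the assignment
thread2.Start();
// prepare data for threads, then let them proceed
Volatile.Write(ref waitA, true);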
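
The same "hold the threads until told to go" idea can be sketched without a polling loop by using `ManualResetEventSlim`. This is only an illustration of the waiting technique: the `StartGateDemo` class and the printed message are placeholders, and on its own it controls when the threads begin, not whether another request slips in between page2 and page3.

```csharp
using System;
using System.Threading;

public static class StartGateDemo
{
    public static void Main()
    {
        // Created unset: workers block in Wait() until Set() is called.
        using (var startGate = new ManualResetEventSlim(false))
        {
            ThreadStart work = () =>
            {
                startGate.Wait();  // sleeps instead of spinning
                Console.WriteLine("working on thread " + Thread.CurrentThread.ManagedThreadId);
            };

            var thread1 = new Thread(work);
            var thread2 = new Thread(work);
            thread1.Start();
            thread2.Start();

            // Prepare shared data here, then release both workers at once.
            startGate.Set();

            // Wait for the workers before the event is disposed.
            thread1.Join();
            thread2.Join();
        }
    }
}
```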

Upvotes: 0
