Isaac Levin

Reputation: 2899

Populating List from Multiple Threads

I have a program that screen-scrapes a handful of pages using HtmlAgilityPack, and I want it to run faster, since loading each page can take 1-2 seconds. Right now the code below does this sequentially.

List<Projections> projectionsList = new List<Projections>();
for (int i = 532; i <= 548; i++)
{
    doc = webGet.Load("http://myurl.com/projections.php?League=&Position=97&Segment=" + i + "&uid=4");
    GetProjection(doc, ref projectionsList, i);
}

Basically I want to break the code inside the loop out into multiple threads and wait until all threads complete before continuing. I expect the List to be fully populated at that point. I realize List<T> is not thread-safe, but I am a little stuck on how to get around that.

Upvotes: 3

Views: 1512

Answers (2)

Hernan Guzman

Reputation: 1235

I suggest using Parallel.For, as in @Tgys's example, but with a ConcurrentBag collection, which is thread-safe, so you don't need to handle locks yourself.

ConcurrentBag<Projections> projectionsList = new ConcurrentBag<Projections>();
// The upper bound of Parallel.For is exclusive, hence 548 + 1.
Parallel.For(532, 548 + 1, i => {
    var doc = webGet.Load("http://myurl.com/projections.php?League=&Position=97&Segment=" + i + "&uid=4");
    GetProjection(doc, ref projectionsList, i);
});

You will probably need to change your GetProjection method so that it accepts a ConcurrentBag<Projections> instead of a List<Projections>, so check whether this solution fits your needs.
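As a rough illustration of that change, here is a minimal sketch of how GetProjection might look once it takes the bag directly. The Projections properties, the LoadPage helper (a stand-in for webGet.Load so the example runs without network access), and the new GetProjection signature are all assumptions for illustration, not the asker's actual code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the asker's Projections type.
public class Projections
{
    public int Segment { get; set; }
    public string Html { get; set; }
}

public static class BagSketch
{
    // Hypothetical stand-in for webGet.Load(url); the real call returns an HtmlDocument.
    static string LoadPage(int segment) => "<html>segment " + segment + "</html>";

    // Assumed new shape of GetProjection: no ref, takes the bag directly.
    static void GetProjection(string doc, ConcurrentBag<Projections> bag, int segment)
    {
        // ConcurrentBag<T>.Add is safe to call from several threads at once.
        bag.Add(new Projections { Segment = segment, Html = doc });
    }

    public static ConcurrentBag<Projections> Run()
    {
        var projections = new ConcurrentBag<Projections>();
        // Upper bound is exclusive, so 549 covers segments 532..548.
        Parallel.For(532, 549, i =>
        {
            var doc = LoadPage(i);
            GetProjection(doc, projections, i);
        });
        // Parallel.For returns only after every iteration has finished.
        return projections;
    }

    public static void Main()
    {
        var result = Run();
        // ConcurrentBag is unordered, so sort if the order matters.
        var segments = result.Select(p => p.Segment).OrderBy(s => s).ToList();
        Console.WriteLine(segments.Count); // prints 17
    }
}
```

Note that a ConcurrentBag does not preserve insertion order, so if the results need to be in segment order they have to be sorted afterwards, as in Main above.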

See this link for more info about the ConcurrentBag class.

Upvotes: 3

Tgys

Reputation: 622

For example, use Parallel.For to create a basic concurrent loop. Then, whenever you manipulate the list, lock it first so that no two threads can modify it at once.

List<Projections> projectionsList = new List<Projections>();       
Parallel.For(532, 548 + 1, i => {
    var doc = webGet.Load("http://myurl.com/projections.php?League=&Position=97&Segment=" + i + "&uid=4");
    lock (projectionsList) {
        GetProjection(doc, ref projectionsList, (i));
    }
});
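If GetProjection does non-trivial parsing, holding the lock for the whole call serializes that work. A possible variation, sketched here under assumptions, is to build the item outside the lock and lock only around the Add. ParseProjection is a hypothetical replacement for webGet.Load plus GetProjection that returns an item instead of mutating the list, and the gate object and the Segment property are likewise made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the asker's Projections type.
public class Projections
{
    public int Segment { get; set; }
}

public static class LockSketch
{
    // Hypothetical stand-in for loading and parsing a page; touches no shared state.
    static Projections ParseProjection(int segment)
    {
        return new Projections { Segment = segment };
    }

    public static List<Projections> Run()
    {
        var projectionsList = new List<Projections>();
        var gate = new object(); // dedicated lock object, never exposed to other code

        // Upper bound is exclusive, so 549 covers segments 532..548.
        Parallel.For(532, 549, i =>
        {
            var item = ParseProjection(i); // runs in parallel, outside the lock
            lock (gate)
            {
                projectionsList.Add(item); // only the list mutation is serialized
            }
        });

        // Parallel.For has returned, so all iterations are done; restore segment order.
        projectionsList.Sort((a, b) => a.Segment.CompareTo(b.Segment));
        return projectionsList;
    }

    public static void Main()
    {
        var list = Run();
        // prints "17 532 548"
        Console.WriteLine(list.Count + " " + list[0].Segment + " " + list[list.Count - 1].Segment);
    }
}
```

Locking a private object rather than the list itself is a common convention, since no other code can accidentally take the same lock.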

Upvotes: 0
