Reputation: 2899
I have a program that screen scrapes a handful of pages using HtmlAgilityPack, and I want it to run faster, as the Load call for each page can take 1-2 seconds on occasion. Right now I have the code below, which loads the pages sequentially.
List<Projections> projectionsList = new List<Projections>();
for (int i = 532; i <= 548; i++)
{
doc = webGet.Load("http://myurl.com/projections.php?League=&Position=97&Segment=" + i + "&uid=4");
GetProjection(doc, ref projectionsList, (i));
}
Basically, I want to split the work inside the loop across multiple threads and wait until all of them have completed before continuing. I would expect the List to be fully populated at that point. I realize that Lists are not thread safe, but I am a little stuck on how to get around that.
Upvotes: 3
Views: 1512
Reputation: 1235
I suggest using Parallel.For, as in @Tgys's example, but with a ConcurrentBag collection. It is thread safe, so you don't need to handle locks yourself.
ConcurrentBag<Projections> projectionsList = new ConcurrentBag<Projections>();
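Parallel.For(532, 548 + 1, i => {
// The upper bound of Parallel.For is exclusive, hence 548 + 1.
var doc = webGet.Load("http://myurl.com/projections.php?League=&Position=97&Segment=" + i + "&uid=4");
// GetProjection must be changed to accept a ConcurrentBag<Projections>.
GetProjection(doc, ref projectionsList, (i));
});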
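You will probably need to change your GetProjection method so that it accepts a ConcurrentBag instead of a List, so check whether this solution fits your needs. Note that a ConcurrentBag is unordered, so the results will not necessarily come back in segment order.
See the documentation for the ConcurrentBag class for more information.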
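As a self-contained illustration of the pattern above, here is a minimal sketch that can be run without HtmlAgilityPack. FetchProjection is a hypothetical stand-in for the webGet.Load and GetProjection calls (it just sleeps and returns a string), and the class and method names are assumptions made for this example, not part of the original code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrentBagSketch
{
    // Hypothetical stand-in for webGet.Load + GetProjection: simulates a slow
    // page load and returns one parsed value per segment.
    static string FetchProjection(int segment)
    {
        Thread.Sleep(50); // pretend the download takes a while
        return "projection for segment " + segment;
    }

    public static ConcurrentBag<string> LoadAll(int first, int last)
    {
        var results = new ConcurrentBag<string>();

        // The upper bound of Parallel.For is exclusive, hence last + 1.
        Parallel.For(first, last + 1, i =>
        {
            // ConcurrentBag.Add is thread safe, so no lock is needed here.
            results.Add(FetchProjection(i));
        });

        // Parallel.For returns only after every iteration has finished,
        // so the bag is fully populated at this point.
        return results;
    }

    public static void Main()
    {
        var results = LoadAll(532, 548);
        Console.WriteLine(results.Count); // 17: one entry per segment
    }
}
```

Because Parallel.For blocks until all iterations are done, no extra waiting code is required before reading the results. If segment order matters, the items would need to carry their segment number and be sorted afterwards.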
Upvotes: 3
Reputation: 622
Use, for example, a Parallel.For loop to create a basic concurrent loop. Then, whenever you manipulate the list, make sure you lock it first so that no two threads can modify it at once. The slow Load call stays outside the lock, so the downloads still run in parallel.
List<Projections> projectionsList = new List<Projections>();
Parallel.For(532, 548 + 1, i => {
var doc = webGet.Load("http://myurl.com/projections.php?League=&Position=97&Segment=" + i + "&uid=4");
lock (projectionsList) {
GetProjection(doc, ref projectionsList, (i));
}
});
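A variation on the locking approach is sketched below, under a few assumptions: FetchProjection is a hypothetical stand-in for the download and parse step, and a dedicated lock object is used instead of locking on the list itself. The idea shown is to keep the slow work outside the lock and serialize only the call to Add.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class LockedListSketch
{
    // Hypothetical stand-in for the slow download and parse.
    static string FetchProjection(int segment)
    {
        Thread.Sleep(50); // pretend the download takes a while
        return "projection for segment " + segment;
    }

    public static List<string> LoadAll(int first, int last)
    {
        var results = new List<string>();
        var gate = new object(); // dedicated lock object guarding the list

        // The upper bound of Parallel.For is exclusive, hence last + 1.
        Parallel.For(first, last + 1, i =>
        {
            // Slow work runs in parallel, outside the lock.
            var item = FetchProjection(i);

            // Only the list mutation is serialized.
            lock (gate)
            {
                results.Add(item);
            }
        });

        return results;
    }

    public static void Main()
    {
        var results = LoadAll(532, 548);
        Console.WriteLine(results.Count); // 17: one entry per segment
    }
}
```

Items are added in whatever order the iterations finish, so the list is not guaranteed to be in segment order.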
Upvotes: 0