vbNewbie

Reputation: 3345

Is this multithreading loop efficient and correct?

I have the following multithreaded routine, which starts one thread per URL in a list to fetch and parse content. The code was suggested by another user, and I just want to know whether this is an efficient way of implementing what I need. I am running the code now and getting errors in all the functions that worked fine when single-threaded. For example, for the list I use to check visited URLs, I now get 'ArgumentOutOfRangeException - capacity was less than the current size'. Does everything now need to be synchronized?

        Dim startwatch As New Stopwatch
        Dim elapsedTime As Long = 0
        Dim urlCompleteList As String = String.Empty
        Dim numThread As Integer = 0
        Dim ThreadList As New List(Of Thread)

        startwatch.Start()
        For Each link In completeList
            Dim thread = New Thread(AddressOf processUrl)
            thread.Start(link)
            ThreadList.Add(thread)
        Next

        For Each Thread In ThreadList
            Thread.Join()
        Next

        startwatch.Stop()
        elapsedTime = startwatch.ElapsedMilliseconds


    End Sub
    Public Sub processUrl(ByVal url As String)

        'make sure we never visited this before
        If Not VisitedPages.Contains(url) Then
            VisitedPages.Add(url) ' line highlighted by the asker
            Dim startwatch As New Stopwatch
            Dim elapsedTime As Long = 0

Upvotes: 0

Views: 451

Answers (2)

Rob Goodwin

Reputation: 2777

I don't see where VisitedPages is declared, but it is not local to the processUrl method, which means it is shared between all of the threads. Multiple threads accessing the list/collection at the same time will cause problems and can generate errors like the one you describe. You will need to protect the VisitedPages collection with a mutex or a similar lock to guard against this.
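As a rough illustration of that locking, here is a minimal sketch using `SyncLock`. It assumes VisitedPages is a plain `List(Of String)` (the question does not show its declaration), and the names `VisitedLock`, `TryMarkVisited`, `VisitedCount` and `ProcessUrlWorker` are made up for the example. The important part is that the `Contains` check and the `Add` happen inside the same lock, so two threads cannot both decide that a URL is new.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Threading

Module VisitedPagesLockSketch
    ' Assumed stand-in for the question's shared VisitedPages collection.
    Private ReadOnly VisitedPages As New List(Of String)
    ' Hypothetical private lock object; every access to VisitedPages takes it.
    Private ReadOnly VisitedLock As New Object()

    ' Returns True only for the first caller that claims the URL.
    Public Function TryMarkVisited(ByVal url As String) As Boolean
        SyncLock VisitedLock
            If VisitedPages.Contains(url) Then
                Return False
            End If
            VisitedPages.Add(url)
            Return True
        End SyncLock
    End Function

    ' Reads the count under the same lock as the writers.
    Public Function VisitedCount() As Integer
        SyncLock VisitedLock
            Return VisitedPages.Count
        End SyncLock
    End Function

    ' Thread entry point; takes Object so it matches ParameterizedThreadStart.
    Private Sub ProcessUrlWorker(ByVal state As Object)
        Dim url As String = CStr(state)
        If TryMarkVisited(url) Then
            Console.WriteLine("processing " & url)
        End If
    End Sub

    Sub Main()
        Dim urls As String() = {"http://example.com/a", "http://example.com/b", "http://example.com/a"}
        Dim threads As New List(Of Thread)
        For Each link As String In urls
            Dim t As New Thread(AddressOf ProcessUrlWorker)
            t.Start(link)
            threads.Add(t)
        Next
        For Each t As Thread In threads
            t.Join()
        Next
        ' Two distinct URLs were supplied, so this prints 2.
        Console.WriteLine(VisitedCount())
    End Sub
End Module
```

The order of the "processing" lines can vary from run to run, but each distinct URL should be claimed exactly once. Any other shared state touched by processUrl would need the same kind of protection.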

Upvotes: 1

nos

Reputation: 229088

If the VisitedPages within processUrl is shared among the threads, then yes, you need to ensure that only one thread can access that collection at a time, unless the collection itself is thread-safe and takes care of that for you.

The same goes for any other data that's shared among the threads you create.
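For the thread-safe collection route, one possible sketch uses `ConcurrentDictionary` from `System.Collections.Concurrent` as a set of visited URLs. This assumes you can target .NET 4 or later, which the question does not state; the names `Visited`, `TryMarkVisited` and `ProcessUrlWorker` are made up for the example. `TryAdd` performs the check and the insert as a single atomic step, so no explicit lock is needed around it.

```vbnet
Imports System
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Threading

Module VisitedPagesConcurrentSketch
    ' Assumed replacement for VisitedPages; the Boolean value is unused filler.
    Private ReadOnly Visited As New ConcurrentDictionary(Of String, Boolean)

    ' TryAdd returns True only for the first thread that adds the key.
    Public Function TryMarkVisited(ByVal url As String) As Boolean
        Return Visited.TryAdd(url, True)
    End Function

    ' Thread entry point; takes Object so it matches ParameterizedThreadStart.
    Private Sub ProcessUrlWorker(ByVal state As Object)
        Dim url As String = CStr(state)
        If TryMarkVisited(url) Then
            Console.WriteLine("processing " & url)
        End If
    End Sub

    Sub Main()
        Dim urls As String() = {"http://example.com/a", "http://example.com/b", "http://example.com/a"}
        Dim threads As New List(Of Thread)
        For Each link As String In urls
            Dim t As New Thread(AddressOf ProcessUrlWorker)
            t.Start(link)
            threads.Add(t)
        Next
        For Each t As Thread In threads
            t.Join()
        Next
        ' Two distinct URLs were supplied, so this prints 2.
        Console.WriteLine(Visited.Count)
    End Sub
End Module
```

This only covers the visited-URL collection. Other shared data in processUrl still needs its own thread-safe type or a lock.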

Upvotes: 2
