Modship

Reputation: 57

Thread safety with Parallel.For in C#

I'm French, so apologies in advance for my English.

I get an "index out of range" error in Visual Studio. It only happens with Parallel.For, not with a classic for loop.

I think one thread is trying to access array[i] while another thread is accessing it too.

The code computes k-means clustering to build links between documents (using cosine similarity).

More information:

My function is:

    private static int FindClosestClusterCenter(List<Centroid> clustercenter, DocumentVector obj)
    {
        float[] similarityMeasure = new float[clustercenter.Count()];
        float[] copy = similarityMeasure;
        object sync = new Object();

        // Sequential equivalent: for (int i = 0; i < clustercenter.Count(); i++)
        Parallel.For(0, clustercenter.Count(), (i) =>
        {
            similarityMeasure[i] = SimilarityMatrics.FindCosineSimilarity(clustercenter[i].GroupedDocument[0].VectorSpace, obj.VectorSpace);
        });

        int index = 0;
        float maxValue = similarityMeasure[0];
        for (int i = 0; i < similarityMeasure.Count(); i++)
        {
            if (similarityMeasure[i] > maxValue)
            {
                maxValue = similarityMeasure[i];
                index = i;
            }
        }
        return index;
    }

My function is called here:

    do
    {
        prevClusterCenter = centroidCollection;
        DateTime starttime = DateTime.Now;

        // Parallel alternative: Parallel.ForEach(documentCollection, parallelOptions, obj =>
        foreach (DocumentVector obj in documentCollection)
        {
            int ind = FindClosestClusterCenter(centroidCollection, obj);

            resultSet[ind].GroupedDocument.Add(obj);
        }
        TimeSpan tempsecoule = DateTime.Now.Subtract(starttime);
        Console.WriteLine(tempsecoule);
        //Console.ReadKey();
        InitializeClusterCentroid(out centroidCollection, centroidCollection.Count());
        centroidCollection = CalculMeanPoints(resultSet);
        stoppingCriteria = CheckStoppingCriteria(prevClusterCenter, centroidCollection);
        if (!stoppingCriteria)
        {
            // reset the result set for the next iteration
            InitializeClusterCentroid(out resultSet, centroidCollection.Count);
        }
    } while (stoppingCriteria == false);
    _counter = counter;
    return resultSet;

FindCosineSimilarity:

    public static float FindCosineSimilarity(float[] vecA, float[] vecB)
    {
        var dotProduct = DotProduct(vecA, vecB);
        var magnitudeOfA = Magnitude(vecA);
        var magnitudeOfB = Magnitude(vecB);
        float result = dotProduct / (float)Math.Pow((magnitudeOfA * magnitudeOfB), 2);
        // When 0 is divided by 0 the result is NaN, so return 0 in that case.
        if (float.IsNaN(result))
            return 0;
        else
            return (float)result;
    }

CalculMeanPoints:

    private static List<Centroid> CalculMeanPoints(List<Centroid> _clust)
    {
        for (int i = 0; i < _clust.Count(); i++)
        {
            if (_clust[i].GroupedDocument.Count() > 0)
            {
                for (int j = 0; j < _clust[i].GroupedDocument[0].VectorSpace.Count(); j++)
                {
                    float total = 0;
                    foreach (DocumentVector vspace in _clust[i].GroupedDocument)
                    {
                        total += vspace.VectorSpace[j];
                    }

                    _clust[i].GroupedDocument[0].VectorSpace[j] = total / _clust[i].GroupedDocument.Count();
                }
            }
        }
        return _clust;
    }

Upvotes: 2

Views: 330

Answers (2)

weston

Reputation: 54781

You may have side effects in FindCosineSimilarity; make sure it does not modify any field or input parameter. An example of a side effect is resultSet[ind].GroupedDocument.Add(obj): if resultSet is not a reference to a locally instantiated array, then that call is a side effect.
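To illustrate the kind of side effect meant here, below is a minimal sketch of the assignment loop run with Parallel.ForEach, as in the commented-out variant in the question. The class name, the use of plain ints as items, and the pickCluster delegate (standing in for FindClosestClusterCenter) are all assumptions made for the example. List<T>.Add is not safe to call from several threads at once, so the shared write is wrapped in a lock while the read-only lookup stays outside it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ClusterAssignmentSketch
{
    // Hypothetical stand-in for the question's assignment loop: items are
    // plain ints and pickCluster plays the role of FindClosestClusterCenter.
    public static List<int>[] Assign(IEnumerable<int> items, int clusterCount, Func<int, int> pickCluster)
    {
        var buckets = new List<int>[clusterCount];
        for (int i = 0; i < clusterCount; i++)
            buckets[i] = new List<int>();

        Parallel.ForEach(items, item =>
        {
            // Read-only work: safe to run concurrently without a lock.
            int index = pickCluster(item);

            // Shared write: List<T>.Add is not thread-safe, so serialize it
            // per bucket. Without the lock, items can be lost or Add can throw.
            lock (buckets[index])
            {
                buckets[index].Add(item);
            }
        });

        return buckets;
    }
}
```

Calling ClusterAssignmentSketch.Assign(Enumerable.Range(0, 1000), 4, x => x % 4) should place every item in exactly one bucket. The order of items inside a bucket is not deterministic, because the loop body runs on several threads.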

Removing any such side effects may fix it. As an aside, you could use AsParallel for this rather than Parallel.For:

    similarityMeasure = clustercenter
        .AsParallel().AsOrdered()
        .Select(c => SimilarityMatrics.FindCosineSimilarity(c.GroupedDocument[0].VectorSpace, obj.VectorSpace))
        .ToArray();
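For anyone who wants to try this in isolation, here is a self-contained sketch of the same PLINQ pattern. It assumes centroids and documents are plain float[] vectors of equal length and that there is at least one center; the class and method names are made up for the example, and Cosine uses the textbook definition (dot product divided by the product of the magnitudes) rather than the question's helper.

```csharp
using System;
using System.Linq;

public static class ClosestCenterSketch
{
    // Textbook cosine similarity: dot(a, b) / (|a| * |b|).
    // Returns 0 when either vector has zero length.
    public static float Cosine(float[] a, float[] b)
    {
        float dot = 0f, sumA = 0f, sumB = 0f;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            sumA += a[i] * a[i];
            sumB += b[i] * b[i];
        }
        float denom = (float)(Math.Sqrt(sumA) * Math.Sqrt(sumB));
        return denom == 0f ? 0f : dot / denom;
    }

    // Scores every center in parallel, then picks the best one sequentially.
    // Assumes centers contains at least one vector.
    public static int FindClosest(float[][] centers, float[] doc)
    {
        // AsOrdered keeps scores[i] aligned with centers[i].
        float[] scores = centers
            .AsParallel().AsOrdered()
            .Select(c => Cosine(c, doc))
            .ToArray();

        int best = 0;
        for (int i = 1; i < scores.Length; i++)
            if (scores[i] > scores[best])
                best = i;
        return best;
    }
}
```

Each Select call only reads its inputs and returns a value, so no shared array is written from inside the parallel section; the result array is assembled by ToArray.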

Upvotes: 1

Tobias Maslowski

Reputation: 41

You realize that if you synchronize the whole body of the Parallel.For, it is just the same as having a normal synchronous for loop, right? That means the code as it is doesn't do anything in parallel, so I don't think you'll have any problems with concurrency. My guess, from what I can tell, is that clustercenter[i].GroupedDocument is probably an empty collection.
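One way to check that guess is to look for a cluster with no documents before indexing GroupedDocument[0], since indexing into an empty List<T> throws ArgumentOutOfRangeException. The sketch below uses a hypothetical, stripped-down Centroid type (documents are plain float[] vectors here) and a made-up helper name; it is only meant to show the check, not the question's real classes.

```csharp
using System;
using System.Collections.Generic;

public static class EmptyClusterGuardSketch
{
    // Hypothetical minimal stand-in for the question's Centroid type.
    public sealed class Centroid
    {
        public List<float[]> GroupedDocument { get; } = new List<float[]>();
    }

    // Returns the index of the first centroid that has no documents,
    // or -1 when every centroid has at least one.
    public static int FirstEmptyCluster(IReadOnlyList<Centroid> centers)
    {
        for (int i = 0; i < centers.Count; i++)
        {
            if (centers[i].GroupedDocument.Count == 0)
                return i;
        }
        return -1;
    }
}
```

If a check like this returns something other than -1 just before FindClosestClusterCenter runs, the exception comes from the empty cluster rather than from a race between threads.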

Upvotes: 0
