TtT23
TtT23

Reputation: 7030

Efficient way to merge table like datastructure using linq

Allow me to present you with my atrocious logic first:

public void MergeLotDataList(List<SPCMeasureData> sPCMeasureDataList)
        {
            double standMaxTotal = 0.0;
            double standAimTotal = 0.0;
            double standMinTotal = 0.0;
            List<SPCLotData> lotDataRemovalList = new List<SPCLotData>();
            foreach (SPCLotData lotData in sPCLotDataList)
            {
                //Find if there's any lotDatas with duplicate identify strings
                var duplicateLotList = sPCLotDataList.Where(w => w.GetIdentifyString() == lotData.GetIdentifyString()).Select(s=>s);
                int duplicateLotCount = duplicateLotList.Count();
                if (duplicateLotCount <= 1)
                    continue;

                //Get the standMax,standAim,standMin total for computing average later
                //and remove duplicates, leaving only a single unique lotData
                foreach (SPCLotData lotData_inner in duplicateLotList)
                {
                    standMaxTotal += lotData_inner.GetStandMax();
                    standAimTotal += lotData_inner.GetStandAim();
                    standMinTotal += lotData_inner.GetStandMin());
                    if (lotData_inner != lotData)
                        lotDataRemovalList.Add(lotData_inner);
                }

                //Remove all duplicates
                foreach (SPCLotData lotDataToRemove in lotDataRemovalList)
                {
                    sPCLotDataList.Remove(lotDataToRemove);
                }
                lotDataRemovalList.Clear();

                //Set the corresponding standdatas to average
                lotData.SetStandData((standMaxTotal / duplicateLotCount),
                                     (standAimTotal / duplicateLotCount),
                                     (standMinTotal / duplicateLotCount);
                standMaxTotal = 0.0;
                standAimTotal = 0.0;
                standMinTotal = 0.0;
            }
        }

Now that I've ensured that my code makes zero sense to everyone (and of course, doesn't work either because i'm modifying the container inside the foreach loop), let me explain what I'm trying to do.

So I have a datastructure like this:

identifyString standMax standAim standMin
-----------------------------------------
     AA          3         4         5
     AA          1         2         3
     AA          1         2         4
     AB          0         5         7
     AC          3         4         5

The end result I'm trying to get is this:

identifyString standMax standAim standMin
-----------------------------------------
     AA          2.5      2.667      4
     AB          0         5         7
     AC          3         4         5

Notice how the duplicate rows (with same identifyString) have been removed, and the uniquely remaining row have the values (standMax,aim,min) is updated as their average.

What's the most elegant way of achieving this?

Upvotes: 1

Views: 89

Answers (3)

Vlad I.
Vlad I.

Reputation: 425

You can use LINQ Enumerable.ToLookup and Enumerable.Average extention methods

Here is the what I mean:

 var perIdentStrLookup = sPCMeasureDataList.ToLookup(k => k.GetIdentifyString());
 foreach(var lk in perIdentStrLookup)
 {
       Console.WriteLine("identifyString={0}; standMax={1}; standAim={2}; standMin={1}",
                           lk.Key,//identifyString
                           lk.Average(l=>GetStandMax()),
                           lk.Average(l=>GetStandAim()),
                           lk.Average(l=>GetStandMin()),

) }

or in case if you want unique list

  var uniqueList = sPCMeasureDataList
            .ToLookup(k => k.GetIdentifyString())
            .Select(lk => new SPCLotData 
            {
                   IdentifyString = lk.Key,
                   StandMax =  lk.Average(l=>GetStandMax()),
                   StandAim = lk.Average(l=>GetStandAim()),
                   StandMin = lk.Average(l=>GetStandMin())
            })
            .ToList()

Upvotes: 4

Squid
Squid

Reputation: 4820

I assume sPCLotDataList is where you get the data?

In which case you could:

var result = from x in sPCLotDataList
             group x by x.identifyString into grp
             select new { identifyString = grp.key
                          standMax = grp.Average(c => c.standMax)
                          standAim = grp.Average(c => c.standAim)
                          standMin= grp.Average(c => c.standMin)
                        }

Upvotes: 0

cuongle
cuongle

Reputation: 75316

You can use LINQ GroupBy:

var result = sPCLotDataList.GroupBy(x => x.identifyString)
                .Select(g => new SPCLotData(){
                        identifyString = g.Key,
                        standMax = g.Average(x => x.standMax),
                        standAim  = g.Average(x => x.standAim),
                        standMin = g.Average(x => x.standMin)
                    });

Upvotes: 3

Related Questions