alexander.skibin
alexander.skibin

Reputation: 133

c# - Summarizing duplicate rows in datatable

I have a table and I want to sum up duplicate rows:

|name  | n |   |name  | n |
|------+---|   |------+---|
|leo   | 1 |   |leo   | 3 |
|wayne | 1 |   |wayne | 2 |
|joe   | 1 |   |joe   | 1 |
|wayne | 1 |
|leo   | 1 |
|leo   | 1 |

I can delete it like this, but how to summarize?

        ArrayList UniqueRecords = new ArrayList();
        ArrayList DuplicateRecords = new ArrayList();

        foreach (DataRow dRow in table.Rows)
        {
            if (UniqueRecords.Contains(dRow["name"]))
                DuplicateRecords.Add(dRow);
            else
                UniqueRecords.Add(dRow["name"]);
        }

        foreach (DataRow dRow in DuplicateRecords)
        {
            table.Rows.Remove(dRow);
        }

Upvotes: 1

Views: 1146

Answers (2)

alexander.skibin
alexander.skibin

Reputation: 133

Thanks for your replies, another variant:

var result = from row in table.AsEnumerable()
                             group row by row.Field<string>("Name") into grp
                             select new
                             {
                                 name = grp.Key,
                                 n = grp.Count()
                             };

Upvotes: 0

Christian
Christian

Reputation: 4375

This is how you do it with a dictionary. Basically you create a dictionary from "name" to DataRow object and then sum up the DataRows' "n" property:

// create intermediate dictionary to group the records
Dictionary<string, DataRow> SummarizedRecords = new Dictionary<string, DataRow>();

// iterate over all records 
foreach(DataRow dRow in table.Rows)
{
  // if the record is in the dictionary already -> sum the "n" value
  if(SummarizedRecords.ContainsKey(dRow["name"]))
  {
    SummarizedRecords[dRow["name"]].n += dRow["n"];
  }
  else
  {
    // otherwise just add the element
    SummarizedRecords[dRow["name"]] = dRow;
  }
}

// transform the dictionary back into a list for further usage
ArrayList<DataRow> summarizedList = SummarizedRecords.Values.ToList();

I think this can be done more elegantly (1 line of code) with LINQ. Let me think some more about it :)

Edit

Here is a Linq version, which however involves creating new DataRow objects, this may not be your intention - don't know:

ArrayList<DataRow> summarizedRecords = table.Rows.GroupBy(row => row["name"]) // this line groups the records by "name"
              .Select(group => 
                      {
                        int sum = group.Sum(item => item["n"]);  // this line sums the "n"'s of the group
                        DataRow newRow = new DataRow();  // create a new DataRow object
                        newRow["name"] = group.Key;      // set the "name" (key of the group)
                        newRow["n"] = sum;               // set the "n" to sum
                        return newRow;                   // return that new DataRow
                      })
              .ToList();     // make the resulting enumerable a list

Upvotes: 2

Related Questions