John Bustos

Reputation: 19544

Linq - Group then compare elements within each group

Suppose, for example, in my C# code, I have MyClass, defined as:

public class MyClass
{
    public string GroupName;
    public DateTime Dt;
    public int Id;
    public string Val;
    // ... other properties ...
}

And suppose I had the following List<MyClass> (showing it as a table since it seems the easiest way to describe the contents):

GroupName:       Dt:             Id:        Val:
Group1           2016/01/01      1          Val1
Group1           2016/01/02      1          Val1
Group1           2016/01/03      1          Val1
Group1           2016/01/04      1          Val2
Group1           2016/01/05      1          Val3
Group1           2016/01/06      1          Val1
Group1           2016/01/07      1          Val1
Group1           2016/01/08      1          Val4
Group1           2016/01/09      1          Val4

With, obviously, the same kind of thing occurring for multiple GroupNames and different Ids.

What I would like to get from this list is, for any named group, the first item at which the value changes. So the output for Group1 would be:

Dt:             Id:        Val:
2016/01/01      1          Val1
2016/01/04      1          Val2
2016/01/05      1          Val3
2016/01/06      1          Val1
2016/01/08      1          Val4

In other words, for a given GroupName:

  1. Group by Id
  2. Order by Date
  3. Select any item within each group where item[index] != item[index-1]

So, I got the following code working:

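public IEnumerable<MyClass> GetUpdatedVals(List<MyClass> myVals, string groupName)
{
    // Sort before materializing so that the index passed to Where below
    // lines up with the positions in filteredVals.
    var filteredVals = myVals
        .Where(v => v.GroupName == groupName)
        .OrderBy(v => v.Id)
        .ThenBy(v => v.Dt)
        .ToList();

    // Keep the first item and every item whose Id or Val differs from its predecessor.
    return filteredVals
        .Where((v, idx) => idx == 0 || v.Id != filteredVals[idx - 1].Id || v.Val != filteredVals[idx - 1].Val);
}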

But it seems like there should be a better way to do this in LINQ, using GroupBy or something similar, without having to create a separate holding list.

Any ideas? Or is this a "perfectly good" / the best way?

Thanks!

Upvotes: 3

Views: 3114

Answers (2)

gnalck

Reputation: 972

If you want something a bit more elegant, you can use the GroupAdjacentBy function described at https://stackoverflow.com/a/4682163/6137718 :

public static class LinqExtensions
{
    public static IEnumerable<IEnumerable<T>> GroupAdjacentBy<T>(
        this IEnumerable<T> source, Func<T, T, bool> predicate)
    {
        using (var e = source.GetEnumerator())
        {
            if (e.MoveNext())
            {
                var list = new List<T> { e.Current };
                var pred = e.Current;
                while (e.MoveNext())
                {
                    if (predicate(pred, e.Current))
                    {
                        list.Add(e.Current);
                    }
                    else
                    {
                        yield return list;
                        list = new List<T> { e.Current };
                    }
                    pred = e.Current;
                }
                yield return list;
            }
        }
    }
}

We can use this to group all adjacent elements that have the same Val and Id, after sorting by Id and Dt. Then from each group we select the first element, as that is the row where the value changed. The updated code would look something like this:

public IEnumerable<MyClass> GetUpdatedVals(List<MyClass> myVals, string groupName)
{
    return myVals
        .Where(v => v.GroupName == groupName)
        .OrderBy(v => v.Id)
        .ThenBy(v => v.Dt)
        .GroupAdjacentBy((x, y) => x.Val == y.Val && x.Id == y.Id)
        .Select(g => g.First());
}
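
To check the query against the Group1 table from the question, here is a minimal self-contained sketch. The `Demo` class and its `SampleData` helper are made up for illustration, and the extension is repeated in a compact form only so that the snippet compiles on its own; it is meant to behave like the GroupAdjacentBy shown above, not to replace it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass
{
    public string GroupName;
    public DateTime Dt;
    public int Id;
    public string Val;
}

public static class LinqExtensions
{
    // Compact restatement of GroupAdjacentBy: starts a new run whenever
    // predicate(previous, current) is false.
    public static IEnumerable<IEnumerable<T>> GroupAdjacentBy<T>(
        this IEnumerable<T> source, Func<T, T, bool> predicate)
    {
        List<T> run = null;
        T prev = default(T);
        foreach (var item in source)
        {
            if (run != null && !predicate(prev, item))
            {
                yield return run;
                run = null;
            }
            if (run == null) run = new List<T>();
            run.Add(item);
            prev = item;
        }
        if (run != null) yield return run;
    }
}

// Hypothetical demo wrapper, not part of the original answer.
public static class Demo
{
    public static IEnumerable<MyClass> GetUpdatedVals(List<MyClass> myVals, string groupName)
    {
        return myVals
            .Where(v => v.GroupName == groupName)
            .OrderBy(v => v.Id)
            .ThenBy(v => v.Dt)
            .GroupAdjacentBy((x, y) => x.Val == y.Val && x.Id == y.Id)
            .Select(g => g.First());
    }

    // Rebuilds the nine Group1 rows from the question (2016/01/01 to 2016/01/09, Id 1).
    public static List<MyClass> SampleData()
    {
        var vals = new[] { "Val1", "Val1", "Val1", "Val2", "Val3", "Val1", "Val1", "Val4", "Val4" };
        return vals
            .Select((val, i) => new MyClass
            {
                GroupName = "Group1",
                Dt = new DateTime(2016, 1, 1).AddDays(i),
                Id = 1,
                Val = val
            })
            .ToList();
    }
}
```

Calling `Demo.GetUpdatedVals(Demo.SampleData(), "Group1")` should give the five rows listed in the question: Val1 on 01/01, Val2 on 01/04, Val3 on 01/05, Val1 on 01/06 and Val4 on 01/08.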

Upvotes: 3

Tim Schmelter

Reputation: 460038

If I understand your requirement and your working code correctly, you want to get all changes. Since you already order by Id, you can use GroupBy to get the Id groups. Within each Id group you then need every object whose Val differs from that of the previous object. You could use the following single query, which creates a list from each group so that the previous element can be accessed by index, and uses SelectMany to flatten the results.

public IEnumerable<MyClass> GetUpdatedVals(List<MyClass> myVals, string groupName)
{
    return myVals
        .Where(v => v.GroupName == groupName)
        .OrderBy(v => v.Id)
        .ThenBy(v => v.Dt)
        .GroupBy(v => v.Id)
        .Select(g => g.ToList())
        .SelectMany(gList => gList
            .Where((v, idx) => idx == 0 || v.Val != gList[idx - 1].Val));
}
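
If the per-group lists are unwanted, one possible alternative is a small iterator that remembers only the previous key. This is a sketch under assumptions: `WhereKeyChanges` and `ChangeDemo` are hypothetical names, not framework methods, and OrderBy still buffers the sequence internally in order to sort it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass
{
    public string GroupName;
    public DateTime Dt;
    public int Id;
    public string Val;
}

public static class ChangeExtensions
{
    // Hypothetical helper: yields the first element and every element
    // whose key differs from the key of the element just before it.
    public static IEnumerable<T> WhereKeyChanges<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        var first = true;
        var prevKey = default(TKey);
        foreach (var item in source)
        {
            var key = keySelector(item);
            if (first || !comparer.Equals(prevKey, key))
                yield return item;
            prevKey = key;
            first = false;
        }
    }
}

// Hypothetical demo wrapper.
public static class ChangeDemo
{
    public static IEnumerable<MyClass> GetUpdatedVals(IEnumerable<MyClass> myVals, string groupName)
    {
        return myVals
            .Where(v => v.GroupName == groupName)
            .OrderBy(v => v.Id)
            .ThenBy(v => v.Dt)
            // Anonymous types compare by value, so a change in either Id or Val counts.
            .WhereKeyChanges(v => new { v.Id, v.Val });
    }
}
```

Because Id is part of the key, the first row of each new Id is returned even when its Val happens to equal the last Val of the previous Id, which matches the `idx == 0` case in the query above.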

Upvotes: 1
