Nick

Reputation: 2907

Using Linq to sum up to a number (and skip the rest)

If we have a class that contains a number like this:

class Person 
{
  public string Name {get; set;}
  public int Amount {get; set;}
}

and then a collection of people:

IList<Person> people;

Say it contains 10 people with random names and amounts. Is there a Linq expression that will return a subcollection of Person objects whose sum fulfills a condition?

For example, I want the first x people whose Amount values sum to less than 1000. I can do that traditionally with:

 var subgroup = new List<Person>();

 // OrderByDescending returns a new sorted sequence instead of sorting in place,
 // so the loop has to iterate over its result.
 var count = 0;
 foreach (var person in people.OrderByDescending(x => x.Amount))
 {
    count += person.Amount;
    if (count < requestedAmount)
    {
        subgroup.Add(person);
    }
    else  
    {
        break;
    }
 }

But I've been wondering if there's an elegant Linq way of doing something like this, using Sum and then some other function like Take?

UPDATE

This is fantastic:

var count = 0;
var subgroup = people
                  .OrderByDescending(x => x.Amount)
                  .TakeWhile(x => (count += x.Amount) < requestedAmount)
                  .ToList();

But I am wondering if I can change it further to also grab the next person in the people list and count only the remainder of that person's amount, so that the total equals the requested amount.

Upvotes: 47

Views: 6676

Answers (7)

Marlene

Reputation: 59

Be careful when using Nick's answer in combination with an ORM: it assigns to x.Amount, so if the objects are tracked by the ORM and the code runs inside a transaction, the changed value can end up in the database too, whether you intended that or not. At least that is the case at our company. It has served us well, though.
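
One way to sidestep that risk is to leave the original objects alone and return copies instead. A minimal sketch, assuming the Person class from the question and non-negative amounts; the TakeUpTo helper and the sample data are hypothetical and not taken from any answer here:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var people = new List<Person>
{
    new Person { Name = "A", Amount = 500 },
    new Person { Name = "B", Amount = 400 },
    new Person { Name = "C", Amount = 300 },
};

// The helper yields copies, so the objects in `people` keep their amounts.
var subgroup = people
    .OrderByDescending(p => p.Amount)
    .TakeUpTo(1000)
    .ToList();

foreach (var p in subgroup)
    Console.WriteLine($"{p.Name}: {p.Amount}");

class Person
{
    public string Name { get; set; } = "";
    public int Amount { get; set; }
}

static class PersonExtensions
{
    // Hypothetical helper: yields copies whose amounts add up to at most
    // `requested`, trimming the last copy so the total lands on `requested`
    // when the source holds enough. Assumes amounts are non-negative.
    public static IEnumerable<Person> TakeUpTo(
        this IEnumerable<Person> source, int requested)
    {
        int remaining = requested;
        foreach (var person in source)
        {
            if (remaining <= 0)
                yield break;

            int taken = Math.Min(person.Amount, remaining);
            remaining -= taken;
            yield return new Person { Name = person.Name, Amount = taken };
        }
    }
}
```

Because the trimmed amount only exists on the copy, nothing the ORM tracks is modified.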

Upvotes: 0

Giorgos Betsos

Reputation: 72175

You can use TakeWhile:

int s = 0;
var subgroup  = people.OrderBy(x => x.Amount)
                      .TakeWhile(x => (s += x.Amount) < 1000)
                      .ToList();

Note: You mention "the first x people" in your post. One could interpret this as the people with the smallest amounts, added up until 1000 is reached, so I used OrderBy. You can substitute OrderByDescending if you want to start from the person with the highest amount.


Edit:

To make it select one more item from the list you can use:

.TakeWhile(x => {
                   bool bExceeds = s > 1000;
                   s += x.Amount;                                 
                   return !bExceeds;
                })

The TakeWhile here examines the s value from the previous iteration, so it will take one more, just to be sure 1000 has been exceeded.

Upvotes: 48

Eric Lippert

Reputation: 660128

I dislike all the answers to this question. They either mutate a variable in a query, which is a bad practice that leads to unexpected results; or, in the case of Niklas's (otherwise good) solution, return a sequence of the wrong type; or, in the case of Jeroen's answer, are correct but could be made to solve a more general problem.

I would improve the efforts of Niklas and Jeroen by making an actually generic solution that returns the right type:

public static IEnumerable<T> AggregatingTakeWhile<T, U>(
  this IEnumerable<T> items, 
  U first,
  Func<T, U, U> aggregator,
  Func<T, U, bool> predicate)
{
  U aggregate = first;
  foreach (var item in items)
  {
    aggregate = aggregator(item, aggregate);
    if (!predicate(item, aggregate))
      yield break;
    yield return item; 
  }
}

Which we can now use to implement a solution to the specific problem:

var subgroup = people
  .OrderByDescending(x => x.Amount)
  .AggregatingTakeWhile(
    0, 
    (item, count) => count + item.Amount, 
    (item, count) => count < requestedAmount)
  .ToList();
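
Because the aggregate type U is independent of the item type T, the same method should also cover accumulators other than int. A sketch with a TimeSpan running total; the Job record, the sample durations and the budget are hypothetical and only there for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var jobs = new[]
{
    new Job("build", TimeSpan.FromMinutes(20)),
    new Job("test", TimeSpan.FromMinutes(30)),
    new Job("deploy", TimeSpan.FromMinutes(15)),
};
var budget = TimeSpan.FromHours(1);

// Take jobs for as long as the accumulated duration stays within the budget.
var fitting = jobs
    .AggregatingTakeWhile(
        TimeSpan.Zero,
        (job, elapsed) => elapsed + job.Duration,
        (job, elapsed) => elapsed <= budget)
    .ToList();

Console.WriteLine(string.Join(", ", fitting.Select(j => j.Name)));

// Hypothetical item type for the example.
record Job(string Name, TimeSpan Duration);

static class Extensions
{
    // Same method as in the answer above, repeated so the example compiles alone.
    public static IEnumerable<T> AggregatingTakeWhile<T, U>(
        this IEnumerable<T> items,
        U first,
        Func<T, U, U> aggregator,
        Func<T, U, bool> predicate)
    {
        U aggregate = first;
        foreach (var item in items)
        {
            aggregate = aggregator(item, aggregate);
            if (!predicate(item, aggregate))
                yield break;
            yield return item;
        }
    }
}
```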

Upvotes: 17

Jeroen van Langen

Reputation: 22038

I took Eric Lippert's comment and came up with this better solution. I think the best way is to create a function (in my case I wrote an extension method):

public static IEnumerable<T> TakeWhileAdding<T>(
    this IEnumerable<T> source, 
    Func<T, int> selector, 
    Func<int, bool> comparer)
{
    int total = 0;

    foreach (var item in source)
    {
        total += selector(item);

        if (!comparer(total))
            yield break;

        yield return item;
    }
}

Usage:

var values = new Person[]
{
    new Person { Name = "Name1", Amount = 300 },
    new Person { Name = "Name2", Amount = 500 },
    new Person { Name = "Name3", Amount = 300 },
    new Person { Name = "Name4", Amount = 300 }
};

var subgroup = values.TakeWhileAdding(
    person => person.Amount, 
    total => total < requestedAmount);

foreach (var v in subgroup)
    Trace.WriteLine(v);

This could also be created for double, float, or something like a TimeSpan.

This way each time the subgroup is iterated, a new counter is used.
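
The difference shows up as soon as a deferred query is enumerated twice. A small sketch, assuming plain int amounts instead of Person objects to keep it short; TakeWhileAdding is the extension method above, while the variable names and sample values are made up:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] amounts = { 300, 500, 300, 300 };
int requestedAmount = 1000;

// Captured counter: the query is deferred and `count` is never reset,
// so a second enumeration starts from wherever the first one stopped.
int count = 0;
IEnumerable<int> captured =
    amounts.TakeWhile(a => (count += a) < requestedAmount);

int firstPass = captured.Count();   // enumerates the query once
int secondPass = captured.Count();  // enumerates it again with the stale count

// Iterator method: `total` is a local of the iterator, created per enumeration.
IEnumerable<int> local =
    amounts.TakeWhileAdding(a => a, total => total < requestedAmount);

int thirdPass = local.Count();
int fourthPass = local.Count();

Console.WriteLine($"captured counter: {firstPass}, then {secondPass}");
Console.WriteLine($"iterator method:  {thirdPass}, then {fourthPass}");

static class Extensions
{
    // Same method as above, repeated so the example compiles on its own.
    public static IEnumerable<T> TakeWhileAdding<T>(
        this IEnumerable<T> source,
        Func<T, int> selector,
        Func<int, bool> comparer)
    {
        int total = 0;
        foreach (var item in source)
        {
            total += selector(item);
            if (!comparer(total))
                yield break;
            yield return item;
        }
    }
}
```

With these values the captured-counter query yields 2 items on the first pass and 0 on the second, while the iterator method yields 2 both times.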

Upvotes: 3

NiklasJ

Reputation: 580

I don't like these approaches of mutating state inside linq queries.

EDIT: I did not state that my previous code was untested and somewhat pseudo-y. I also missed the point that Aggregate consumes the entire sequence at once and returns a single value; as correctly pointed out, it didn't work. The idea was right, though; we just need an alternative to Aggregate.

It's a shame that LINQ doesn't have a running aggregate. I suggest the code from user2088029 in this post: How to compute a running sum of a series of ints in a Linq query?.

And then use this (which is tested and is what I intended):

var y = people.Scanl(new { item = (Person) null, Amount = 0 },
    (sofar, next) => new { 
        item = next, 
        Amount = sofar.Amount + next.Amount 
    } 
);       

Stolen code here for longevity:

public static IEnumerable<TResult> Scanl<T, TResult>(
    this IEnumerable<T> source,
    TResult first,
    Func<TResult, T, TResult> combine)
{
    using (IEnumerator<T> data = source.GetEnumerator())
    {
        yield return first;

        while (data.MoveNext())
        {
            first = combine(first, data.Current);
            yield return first;
        }
    }
}
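
The Scanl call yields running totals, including the seed, rather than Person objects, so a few more steps are needed to get back to a subgroup. A sketch of how the pieces might fit together, assuming the Person class from the question; the Skip/TakeWhile/Select tail, the sample data and requestedAmount are assumptions about the intent, not code from the answer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var people = new List<Person>
{
    new Person { Name = "a", Amount = 100 },
    new Person { Name = "b", Amount = 200 },
    new Person { Name = "c", Amount = 900 },
};
int requestedAmount = 1000;

var subgroup = people
    .Scanl(new { item = (Person)null, Amount = 0 },
        (sofar, next) => new
        {
            item = next,
            Amount = sofar.Amount + next.Amount
        })
    .Skip(1)                                     // drop the seed, whose item is null
    .TakeWhile(x => x.Amount < requestedAmount)  // x.Amount is the running total
    .Select(x => x.item)                         // back to plain Person objects
    .ToList();

Console.WriteLine(string.Join(", ", subgroup.Select(p => p.Name)));

class Person
{
    public string Name { get; set; } = "";
    public int Amount { get; set; }
}

static class Extensions
{
    // Same Scanl as above, repeated so the example compiles on its own.
    public static IEnumerable<TResult> Scanl<T, TResult>(
        this IEnumerable<T> source,
        TResult first,
        Func<TResult, T, TResult> combine)
    {
        using (IEnumerator<T> data = source.GetEnumerator())
        {
            yield return first;

            while (data.MoveNext())
            {
                first = combine(first, data.Current);
                yield return first;
            }
        }
    }
}
```

No variable outside the query is mutated here, which is the point of the approach.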

Previous, wrong code:

I have another suggestion; begin with a list

people

[{"a", 100}, 
 {"b", 200}, 
 ... ]

Calculate the running totals:

people.Aggregate((sofar, next) => new {item = next, total = sofar.total + next.value})


[{item: {"a", 100}, total: 100}, 
 {item: {"b", 200}, total: 300},
 ... ]

Then use TakeWhile and Select to return to just items;

people
 .Aggregate((sofar, next) => new {item = next, total = sofar.total + next.value})
 .TakeWhile(x=>x.total<1000)
 .Select(x=>x.Item)

Upvotes: 24

Nick

Reputation: 2907

Giorgos pointed me in the right direction, so his answer is the accepted one.

However for completeness I am writing here the solution that I ended up with.

var count = 0;
var exceeds = false;

var subgroup  = people.OrderBy(x => x.Amount).TakeWhile(x =>
{
    if (exceeds)
    {
        return false;
    }

    count += x.Amount;
    if (count >= requestedAmount)
    {
        x.Amount = requestedAmount - (count - x.Amount);
        exceeds = true;
        return true;
    }

    return !exceeds;
}).ToList();

This returns a subgroup whose total amount is equal to the requested amount. Thanks so much!

Upvotes: 1

Jeroen van Langen

Reputation: 22038

Try:

int sumCount = 0;

var subgroup = people
    .OrderByDescending(item => item.Amount)           // <-- you wanted to sort them?
    .Where(item => (sumCount += item.Amount) < requestedAmount)
    .ToList();

But it's not charming... It will be less readable.
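
Besides readability, Where and TakeWhile also differ in how much of the sequence they look at: Where evaluates the predicate for every element, whereas TakeWhile stops at the first failure. A small sketch that counts predicate calls, assuming non-negative int amounts; the counters and sample values are made up, and the mutable state is only there to make the difference visible:

```csharp
using System;
using System.Linq;

int[] amounts = { 500, 400, 300, 200, 100 };
int requestedAmount = 1000;

// Where: the predicate runs for every element, even after the limit is hit.
int whereCalls = 0, whereSum = 0;
var viaWhere = amounts
    .Where(a => { whereCalls++; return (whereSum += a) < requestedAmount; })
    .ToList();

// TakeWhile: enumeration ends at the first element that fails the predicate.
int takeWhileCalls = 0, takeWhileSum = 0;
var viaTakeWhile = amounts
    .TakeWhile(a => { takeWhileCalls++; return (takeWhileSum += a) < requestedAmount; })
    .ToList();

Console.WriteLine($"Where:     {viaWhere.Count} items, {whereCalls} predicate calls");
Console.WriteLine($"TakeWhile: {viaTakeWhile.Count} items, {takeWhileCalls} predicate calls");
```

Here both queries return the same two items, but Where calls the predicate five times and TakeWhile three times.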

Upvotes: 7
