Reputation: 21
I have a dictionary with a string key and integer value. The value represents the number of occurrences of the key.
How do I create a new dictionary with the keys and values representing the top 25% of values? The sum of the selected values should be equal to or greater than the specified percentage of the sum of all values. For example, if my dictionary contains 5 items with values (5, 3, 2, 1, 1) and I want the top 50%, the new dictionary would contain values (5, 3) because their sum is 8, which is >= 50% of 12 (that is, 6). The entries need to be sorted descending by value and then the top N taken such that their sum meets the specified percentage.
This code gives me the top N but is based on a known count. How do I take into account the desired percentage?
var topItemsCount = dictionary.OrderByDescending(entry => entry.Value)
    .Take(topN)
    .ToDictionary(pair => pair.Key, pair => pair.Value);
Upvotes: 2
Views: 1353
Reputation: 13954
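Rephrasing the question into two parts: (1) what value n corresponds to the given percentage of the total, and (2) which of the largest entries are needed for their sum to reach n.
Question 1 would look like: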
double percent = inputValue;
double n = dictionary.Values.Sum() * percent;
Question 2 would look like:
Dictionary<string, int> newValues = dictionary.OrderByDescending(_ => _.Value)
    .Aggregate(
        new {sum = 0.0, values = new Dictionary<string, int>()},
        (sumValues, kv) =>
        {
            // Add entries only while the running sum is still below the target n;
            // using < (not <=) avoids taking an extra entry when the sum equals n exactly.
            if (sumValues.sum < n)
                sumValues.values.Add(kv.Key, kv.Value);
            return new {sum = sumValues.sum + kv.Value, values = sumValues.values};
        },
        sumValues => sumValues.values);
You could also use a for loop and a running sum, but for running totals with limited scope, I like the compactness of the Aggregate function. The downside to this is that the entire source Dictionary is still iterated. A custom iterator method would get around this. For example:
public static class Extensions
{
    public static IEnumerable<TThis> TakeGreaterThan<TThis>(this IEnumerable<TThis> source, Func<TThis, double> valueFunc, double compareTo)
    {
        double sum = 0.0;
        IEnumerable<TThis> orderedSource = source.OrderByDescending(valueFunc);
        // Dispose the enumerator once iteration stops.
        using (var enumerator = orderedSource.GetEnumerator())
        {
            // Stop as soon as the running sum reaches compareTo.
            while (sum < compareTo && enumerator.MoveNext())
            {
                yield return enumerator.Current;
                sum += valueFunc(enumerator.Current);
            }
        }
    }
}
Used as
Dictionary<string, int> newValues = dictionary.TakeGreaterThan(_ => _.Value, n).ToDictionary(_ => _.Key, _ => _.Value);
Upvotes: 1
Reputation: 6608
Something like:
var topItemsCount = dictionary.OrderByDescending(entry => entry.Value)
    .Take((int)Math.Floor(dictionary.Count * 0.25))
    .ToDictionary(pair => pair.Key, pair => pair.Value);
Reading .Count on a dictionary returns the number of key-value pairs in the collection. Math.Floor rounds the result down to a whole number, but it returns a double, so it is cast to int before being passed to Take.
Edited to reflect comments
I would probably just use a simple non-LINQ solution to achieve what you want. It may be more verbose, but it's pretty clear to anyone what it does:
var total = dictionary.Sum(e => e.Value);
var cutoff = total * 0.5;
var sum = 0;
var pairs = new List<KeyValuePair<string, int>>();
foreach (var pair in dictionary.OrderByDescending(e => e.Value))
{
    sum += pair.Value;
    pairs.Add(pair);
    // Stop once the running sum reaches the cutoff (>= matches "equal to or greater than").
    if (sum >= cutoff)
        break;
}
dictionary = pairs.ToDictionary(pair => pair.Key, pair => pair.Value);
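The loop above hard-codes 50%. As a rough sketch of how the percentage could be passed in instead, the same running-sum idea can be wrapped in a helper. The names `TopShare` and `TakeTopShare` are made up for illustration, and `share` is assumed to be a fraction between 0 and 1 (0.25 for 25%):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopShare
{
    // Hypothetical helper: returns the largest entries whose values together
    // reach at least `share` (assumed to be a fraction in [0, 1]) of the total.
    public static Dictionary<string, int> TakeTopShare(
        IDictionary<string, int> source, double share)
    {
        double cutoff = source.Values.Sum() * share;
        var result = new Dictionary<string, int>();
        long sum = 0;

        foreach (var pair in source.OrderByDescending(e => e.Value))
        {
            if (sum >= cutoff)
                break;              // target already reached; stop early
            result.Add(pair.Key, pair.Value);
            sum += pair.Value;
        }
        return result;
    }

    public static void Main()
    {
        var counts = new Dictionary<string, int>
        {
            { "a", 5 }, { "b", 3 }, { "c", 2 }, { "d", 1 }, { "e", 1 },
        };

        var top = TakeTopShare(counts, 0.5);   // cutoff = 12 * 0.5 = 6
        Console.WriteLine(top.Count);          // prints 2 (entries "a" and "b")
    }
}
```

Note that `Dictionary<TKey, TValue>` does not document an enumeration order, so if the descending order matters to later code, keeping the result as a list of pairs may be the safer choice.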
One more edit
If you really want more LINQ, you could try keeping the running total in a class-level variable.
private static int sum = 0;
static void Main(string[] args)
{
    var dictionary = new Dictionary<string, int>()
    {
        {"1",5},
        {"2",3},
        {"3",2},
        {"4",1},
        {"5",1},
    };
    var total = dictionary.Sum(e => e.Value);
    var cutoff = total * 0.5;
    var filtered = dictionary.OrderByDescending(e => e.Value)
        .TakeWhile(e => Add(e.Value).Item1 < cutoff)
        .ToDictionary(pair => pair.Key, pair => pair.Value);
}
private static Tuple<int, int> Add(int x)
{
    return Tuple.Create(sum, sum += x);
}
It's a bit convoluted, with the Add function returning a tuple of the sum before and after the addition, because the first value that breaches the cutoff still has to be included in the result (i.e. even though 5 + 3 = 8 is greater than the cutoff of 6, you still include 3).
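One possible variation, sketched here as an assumption rather than as part of the original answer, is to capture a local variable in the lambda instead of using a static field and a tuple. The names `TakeWhileDemo` and `TakeWhileBelowCutoff` are hypothetical, and `share` is assumed to be a fraction between 0 and 1:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TakeWhileDemo
{
    // Hypothetical helper; `share` is assumed to be a fraction in [0, 1].
    public static Dictionary<string, int> TakeWhileBelowCutoff(
        Dictionary<string, int> source, double share)
    {
        double cutoff = source.Sum(e => e.Value) * share;

        // Local running total captured by the lambda, replacing the static field.
        int runningSum = 0;
        return source
            .OrderByDescending(e => e.Value)
            .TakeWhile(e =>
            {
                // Test the total before this entry is added, so the entry
                // that crosses the cutoff is still kept.
                bool keep = runningSum < cutoff;
                runningSum += e.Value;
                return keep;
            })
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }

    public static void Main()
    {
        var dictionary = new Dictionary<string, int>
        {
            { "1", 5 }, { "2", 3 }, { "3", 2 }, { "4", 1 }, { "5", 1 },
        };
        var filtered = TakeWhileBelowCutoff(dictionary, 0.5);   // cutoff = 12 * 0.5 = 6
        Console.WriteLine(filtered.Count);                      // prints 2 (entries "1" and "2")
    }
}
```

The lambda has a side effect, so this only behaves as intended because the query is enumerated exactly once by ToDictionary; re-enumerating a stored query would keep adding to the same total.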
Upvotes: 1
Reputation: 73482
Maybe this?
var dictionary = new Dictionary<string, int>()
{
    {"1",5},
    {"2",3},
    {"3",2},
    {"4",1},
    {"5",1},
};
var max = dictionary.Values.Max();
int percent = 50;
int percentageValue = max*percent /100;
var topItems = dictionary.OrderByDescending(entry => entry.Value)
    .TakeWhile(x => x.Value > percentageValue)
    .ToDictionary(pair => pair.Key, pair => pair.Value);
foreach (var item in topItems)
{
    Console.WriteLine(item.Value);
}
Outputs:
5
3
Upvotes: 0