Reputation: 15185
I am trying to figure out the best way to calculate a running sum partition with a self joined collection using LINQ.
The query below is a somewhat simple example of what I am after. The output is the RowNumber, the RowType and the sum of all preceding RowValues within the current row's RowType.
DECLARE @T TABLE (RowNumber INT, RowType INT, RowValue INT)
INSERT @T VALUES (1,1,1),(2,1,1),(3,1,1),(4,1,1),(5,1,1),(6,2,1),(7,2,1),(8,2,1),(9,2,1),(10,2,1)
;WITH Data AS(SELECT RowNumber, RowType,RowValue FROM @T)
SELECT
This.RowNumber,
This.RowType,
RunningValue = COALESCE(This.RowValue + SUM(Prior.RowValue),This.RowValue)
FROM
Data This
LEFT OUTER JOIN Data Prior ON Prior.RowNumber < This.RowNumber AND Prior.RowType = This.RowType
GROUP BY
This.RowNumber,
This.RowType,
This.RowValue
/* OR
SELECT
This.RowNumber,
This.RowType,
RunningValue = SUM(RowValue) OVER (PARTITION BY RowType ORDER BY RowNUmber)
FROM
Data This
*/
Now, my not working attempt.
var joinedWithPreviousSums = allRows.Join(
allRows,
previousRows => new {previousRows.RowNumber, previousRows.RowType, previousRows.RowValue},
row=> new { row.RowNumber, row.RowType, row.RowValue},
(previousRows, row) => new { row.RowNumber, row.RowType, row.RowValue })
.Where(previousRows.RowType == row.RowType && previousRows.RowNumber < row.RowNumber)
.Select(row.RowNumber, row.RowType,RunningValue = Sum(previousRows.Value) + row.RowValue)).ToList()
Of course, the last two lines above are garbage and attempt to exemplify my desired projection while hinting at my lack of knowledge on performant complex LINQ projections.
I have read where some variation of the statement below could work and may be workable, however, is there a way to achieve similar results without yielding?
int s = 0;
var subgroup = people.OrderBy(x => x.Amount)
.TakeWhile(x => (s += x.Amount) < 1000)
.ToList();
EDIT : I have been able to get the snippet below to work, however, I cant seem to partition or project over RowType.
namespace ConsoleApplication1
{
class Program
{
delegate string CreateGroupingDelegate(int i);
static void Main(string[] args)
{
List <TestClass> list = new List<TestClass>()
{
new TestClass(1, 1, 1),
new TestClass(2, 2, 5),
new TestClass(3, 1, 1 ),
new TestClass(4, 2, 5),
new TestClass(5, 1, 1),
new TestClass(6, 2, 5)
};
int running_total = 0;
var result_set = list.Select(x => new { x.RowNumber, x.RowType, running_total = (running_total = running_total + x.RowValue) }).ToList();
foreach (var v in result_set)
{
Console.WriteLine("list element: {0}, total so far: {1}",
v.RowNumber,
v.running_total);
}
Console.ReadLine();
}
}
public class TestClass
{
public TestClass(int rowNumber, int rowType, int rowValue)
{
RowNumber = rowNumber;
RowType = rowType;
RowValue = rowValue;
}
public int RowNumber { get; set; }
public int RowType { get; set; }
public int RowValue { get; set; }
}
}
Upvotes: 1
Views: 1184
Reputation: 26927
Your answer can be simplified greatly, but does scale poorly even then, as it must go through the Where
for each row to compute each row, so O(list.Count^2
).
Here is the simpler version, which preserves the original order:
var result = list.Select(item => new {
RowType = item.RowType,
RowValue = list.Where(prior => prior.RowNumber <= item.RowNumber && prior.RowType == item.RowType).Sum(prior => prior.RowValue)
});
You can go through list
once if are willing to sort. (If you know the order is correct, or can use a simpler sort, you can remove or replace the OrderBy
/ThenBy
.)
var ans = list.OrderBy(x => x.RowType)
.ThenBy(x => x.RowNumber)
.Scan(first => new { first.RowType, first.RowValue },
(res, cur) => res.RowType == cur.RowType ? new { res.RowType, RowValue = res.RowValue + cur.RowValue }
: new { cur.RowType, cur.RowValue }
);
This answer uses an extension method that is like Aggregate
, but returns the intermediate results, based on the APL scan operator:
// TRes seedFn(T FirstValue)
// TRes combineFn(TRes PrevResult, T CurValue)
public static IEnumerable<TRes> Scan<T, TRes>(this IEnumerable<T> src, Func<T, TRes> seedFn, Func<TRes, T, TRes> combineFn) {
using (var srce = src.GetEnumerator()) {
if (srce.MoveNext()) {
var prev = seedFn(srce.Current);
while (srce.MoveNext()) {
yield return prev;
prev = combineFn(prev, srce.Current);
}
yield return prev;
}
}
}
Upvotes: 2
Reputation: 15185
My eyes were glazed over after seeing this. The answer to my long winded question after 6 hours of skulldrugery seems to be as simple as this. Thanks to @NetMage for pointing out the SelectMany that I was missing.
var result = list.SelectMany(item => list.Where(x => x.RowNumber <= item.RowNumber && x.RowType == item.RowType)
.GroupBy(g => g.RowType)
.Select(p => new
{
RowType = p.Max(s => s.RowType),
RowValue = p.Sum(s => s.RowValue)
}));
Upvotes: 0