Reputation: 12087
I need to process stock data that comes in this format:
public class A
{
    public int Price;
    public int Available;
}
Take this data as an example:
var items = new List<A>
{
    new A { Price = 10, Available = 1000 },
    new A { Price = 15, Available = 500 },
    new A { Price = 20, Available = 2000 },
};
My query returns the average price for a requested volume. For example:
If the requested volume is 100, the average price is 10.
If the requested volume is 1200, I take the first 1000 at a price of 10, then the remaining 200 at a price of 15, and so on down the list until the volume is filled.
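Spelled out, the 1200 case is a volume-weighted average. A minimal sketch of the arithmetic (the class and method names are only for illustration):

```csharp
using System;

public static class WorkedExample
{
    // Requested volume 1200 against the sample data:
    // 1000 units at 10, then the remaining 200 units at 15.
    public static double AveragePriceFor1200()
    {
        double cost = 1000 * 10 + 200 * 15;   // 10000 + 3000 = 13000
        return cost / 1200;                   // 13000 / 1200, about 10.83
    }
}
```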
I have implemented this in C#, but I would like to know whether it can be done with LINQ directly against the database iterator.
The data I get is already sorted by price, but I don't see how to solve this without an explicit loop.
Edit:
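This is the code:
public static double PriceAtVolume(IEnumerable<A> Data, long Volume)
{
    var PriceSum = 0.0;   // total cost of the units taken so far
    var VolumeSum = 0L;   // number of units taken so far
    foreach (var D in Data)
    {
        if (D.Available < Volume)
        {
            // take the whole level and keep going
            PriceSum += D.Price * D.Available;
            VolumeSum += D.Available;
            Volume -= D.Available;
        }
        else
        {
            // take only what is still needed from this level
            PriceSum += D.Price * Volume;
            VolumeSum += Volume;
            Volume = 0;
        }
        if (Volume == 0) break;   // stop enumerating once the request is filled
    }
    return PriceSum / VolumeSum;
}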
And the test code:
var a = new List<A>
{
    new A { Price = 10, Available = 1000 },
    new A { Price = 15, Available = 500 },
    new A { Price = 20, Available = 2000 }
};
var P0 = PriceAtVolume(a, 100);   // 10
var P1 = PriceAtVolume(a, 1200);  // 13000 / 1200, about 10.83
Clarification:
When I said I'd like to move this to LINQ to use the database iterator, I meant that I want to avoid scanning the entire data set and stop iterating as soon as the answer is known. The data is already sorted by price in the database.
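One way to check that a given implementation really stops early is to wrap the source in a counting iterator. The sketch below is only an illustration: Counted, Pulled and EarlyStopDemo are made-up names, and it runs against an in-memory list. Whether a database-backed sequence also stops fetching rows depends on how the provider streams or buffers results, which this does not show.

```csharp
using System;
using System.Collections.Generic;

public class A
{
    public int Price;
    public int Available;
}

public static class EarlyStopDemo
{
    // Number of items pulled from the source so far (illustration only).
    public static int Pulled;

    // Hypothetical wrapper that counts how many items the consumer requests.
    public static IEnumerable<A> Counted(IEnumerable<A> source)
    {
        foreach (var item in source)
        {
            Pulled++;
            yield return item;
        }
    }

    // Same walk as the loop in the question, written against Available.
    public static double PriceAtVolume(IEnumerable<A> data, long volume)
    {
        double cost = 0.0;
        long filled = 0;
        foreach (var d in data)
        {
            long take = Math.Min(d.Available, volume - filled);
            cost += (double)d.Price * take;
            filled += take;
            if (filled == volume) break;   // no further items are pulled after this
        }
        return filled == 0 ? 0.0 : cost / filled;
    }
}
```

With the sample data and a requested volume of 1200, calling EarlyStopDemo.PriceAtVolume(EarlyStopDemo.Counted(items), 1200) after resetting Pulled to 0 leaves Pulled at 2: the third price level is never requested.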
Upvotes: 4
Views: 2129
Reputation: 26907
I think the best you can do with LINQ is to keep the running-total computation on the server small and do most of the work on the client, while minimizing the amount of data downloaded from the server.
I assume the items are already projected down to the two columns needed (Price, Available). If not, a Select can be added before pulling the data from the database into orderedItems.
// origReqVol is the requested volume
// find price of last item needed; worst case there won't be one (request exceeds the total available)
var lastPriceItem = items.OrderBy(i => i.Price)
    .Select(i => new { i.Price, RT = items.Where(it => it.Price <= i.Price).Sum(it => it.Available) })
    .FirstOrDefault(irt => irt.RT > origReqVol);
// bring over items at or below that price (all items if there is no such price)
var neededItems = lastPriceItem == null ? items : items.Where(i => i.Price <= lastPriceItem.Price);
var orderedItems = neededItems.OrderBy(i => i.Price).ToList();
// compute running total on client
var rtItems = orderedItems.Select(i => new {
    Item = i,
    RT = orderedItems.Where(i2 => i2.Price <= i.Price).Sum(i2 => i2.Available)
});
// compute average price
var reqVol = origReqVol;
var ans = rtItems.Select(irt => new { Price = irt.Item.Price, Quantity = Math.Min((reqVol -= irt.Item.Available) + irt.Item.Available, irt.Item.Available) })
    .Sum(pq => pq.Price * pq.Quantity) / (double)origReqVol;
Upvotes: 0
Reputation: 43399
This is probably the most Linqy you can get. It uses the Aggregate method, specifically the most complex of its three overloads, the one that accepts three arguments. The first argument is the seed, which is initialized with a zeroed ValueTuple<long, decimal>. The second argument is the accumulator function, which holds the logic that combines the seed and the current element into a new seed. The third argument takes the final accumulated values and projects them to the desired average.
public static decimal PriceAtVolume(IEnumerable<A> data, long requestedVolume)
{
    return data.Aggregate(
        (Volume: 0L, Price: 0M), // Seed
        (sum, item) => // Accumulator function
        {
            // Aggregate still visits every remaining item; this only skips the work
            if (sum.Volume == requestedVolume)
                return sum; // Goal reached, quick return
            if (item.Available < requestedVolume - sum.Volume)
                return // Consume all of it
                (
                    sum.Volume + item.Available,
                    sum.Price + (decimal)item.Price * item.Available
                );
            return // Consume part of it (and we are done)
            (
                requestedVolume,
                sum.Price + (decimal)item.Price * (requestedVolume - sum.Volume)
            );
        },
        sum => sum.Volume == 0L ? 0M : sum.Price / sum.Volume // Result selector
    );
}
Update: I changed the return type from double to decimal, because a decimal is the preferred type for currency values.
By the way, if this function is called very often with the same data and the list is huge, it could be optimized by storing the accumulated summaries in a List<(long, decimal)> and applying BinarySearch to quickly find the desired entry. It becomes complex though, and I don't expect the prerequisites for this optimization to come up very often.
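A rough sketch of that optimization, under some assumptions: the data is already sorted by price, the running totals are kept in two parallel lists rather than a single list of tuples (so that List<long>.BinarySearch can be used directly), and a request larger than the total available is averaged over everything there is, like the Aggregate version. PriceLadder and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public class A
{
    public int Price;
    public int Available;
}

// Hypothetical helper: precomputes running totals once, then answers each query in O(log n).
public sealed class PriceLadder
{
    private readonly List<A> _items = new List<A>();
    private readonly List<long> _cumVolume = new List<long>();     // volume up to and including level i
    private readonly List<decimal> _cumCost = new List<decimal>(); // cost up to and including level i

    // Assumes data is already sorted by price, as in the question.
    public PriceLadder(IEnumerable<A> data)
    {
        long volume = 0;
        decimal cost = 0M;
        foreach (var item in data)
        {
            if (item.Available <= 0) continue; // skip empty levels so cumulative volumes strictly increase
            volume += item.Available;
            cost += (decimal)item.Price * item.Available;
            _items.Add(item);
            _cumVolume.Add(volume);
            _cumCost.Add(cost);
        }
    }

    public decimal PriceAtVolume(long requestedVolume)
    {
        if (requestedVolume <= 0 || _items.Count == 0) return 0M;

        // BinarySearch returns the index of an exact match, or the bitwise
        // complement of the index of the first larger element.
        int index = _cumVolume.BinarySearch(requestedVolume);
        if (index < 0) index = ~index;

        // More requested than available: average over everything there is.
        if (index >= _items.Count)
            return _cumCost[_items.Count - 1] / _cumVolume[_items.Count - 1];

        long volumeBefore = index == 0 ? 0L : _cumVolume[index - 1];
        decimal costBefore = index == 0 ? 0M : _cumCost[index - 1];
        decimal cost = costBefore + (decimal)_items[index].Price * (requestedVolume - volumeBefore);
        return cost / requestedVolume;
    }
}
```

Building the ladder is a single pass over the data, so this only pays off when the same data is queried many times. It also reads the whole list up front, which is the opposite of the early-stop behavior asked for in the question.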
Upvotes: 3
Reputation: 94
You could expand each item into a sequence that repeats its price once per available unit, e.g.:
public class A
{
    public int Price;
    public int Available;
    public IEnumerable<int> Inv => Enumerable.Repeat(Price, Available);
}
var avg1 = items.SelectMany(i => i.Inv).Take(100).Average(); // 10
var avg2 = items.SelectMany(i => i.Inv).Take(1200).Average(); // 10.8333333333333
Upvotes: 0
Reputation: 5635
This works as well (although it is not a one-liner):
private static decimal CalculateWeighedAverage(List<A> amountsAndPrices, int requestedVolume)
{
    int originalRequestedVolume = requestedVolume;
    return (decimal)amountsAndPrices.Sum(amountAndPrice =>
    {
        int partialResult = Math.Min(amountAndPrice.Available, requestedVolume) * amountAndPrice.Price;
        requestedVolume = Math.Max(requestedVolume - amountAndPrice.Available, 0);
        return partialResult;
    }) / originalRequestedVolume;
}
For each item, add price * min(available, remaining requested volume) to the sum, then subtract the item's available amount from the remaining requested volume (never going below 0). Once the remaining volume reaches 0, the later items contribute nothing. Finally, divide the sum by the original requested volume.
Upvotes: 0