Reputation: 1447
Folks,
Consider the following example: given a list of Trade objects, my code needs to return an array containing the trade volume for the last 24 hours, 7 days, 30 days, and all time.
Using a plain old iterator, this requires only a single iteration over the collection.
I'm trying to do the same using Java 8 streams and lambda expressions. I came up with this code, which looks elegant and works fine, but requires 4 iterations over the list:
public static final int DAY = 24 * 60 * 60;

public double[] getTradeVolumes(List<Trade> trades, int timeStamp) {
    double volume = trades.stream().mapToDouble(Trade::getVolume).sum();
    double volume30d = trades.stream().filter(trade -> trade.getTimestamp() + 30 * DAY > timeStamp).mapToDouble(Trade::getVolume).sum();
    double volume7d = trades.stream().filter(trade -> trade.getTimestamp() + 7 * DAY > timeStamp).mapToDouble(Trade::getVolume).sum();
    double volume24h = trades.stream().filter(trade -> trade.getTimestamp() + DAY > timeStamp).mapToDouble(Trade::getVolume).sum();
    return new double[]{volume24h, volume7d, volume30d, volume};
}
How can I achieve the same result using only a single iteration over the list?
Upvotes: 8
Views: 1090
Reputation: 1447
Thanks, Brian. I ended up implementing the code below. It's not as simple as I had hoped, but at least it iterates only once, it's parallel-ready, and it passes my unit tests. Any ideas for improvement are welcome.
public double[] getTradeVolumes(List<Trade> trades, int timeStamp) {
    TradeVolume tradeVolume = trades.stream().collect(
            () -> new TradeVolume(timeStamp),
            TradeVolume::accept,
            TradeVolume::combine);
    return tradeVolume.getVolume();
}

public static final int DAY = 24 * 60 * 60;

static class TradeVolume {
    private int timeStamp;
    private double[] volume = new double[4];

    TradeVolume(int timeStamp) {
        this.timeStamp = timeStamp;
    }

    public void accept(Trade trade) {
        long tradeTime = trade.getTimestamp();
        double tradeVolume = trade.getVolume();
        volume[3] += tradeVolume;
        if (!(tradeTime + 30 * DAY > timeStamp)) {
            return;
        }
        volume[2] += tradeVolume;
        if (!(tradeTime + 7 * DAY > timeStamp)) {
            return;
        }
        volume[1] += tradeVolume;
        if (!(tradeTime + DAY > timeStamp)) {
            return;
        }
        volume[0] += tradeVolume;
    }

    public void combine(TradeVolume tradeVolume) {
        volume[0] += tradeVolume.volume[0];
        volume[1] += tradeVolume.volume[1];
        volume[2] += tradeVolume.volume[2];
        volume[3] += tradeVolume.volume[3];
    }

    public double[] getVolume() {
        return volume;
    }
}
Upvotes: 1
Reputation: 95346
This problem is similar to the "summary statistics" collector. Take a look at the IntSummaryStatistics class:
public class IntSummaryStatistics implements IntConsumer {
    private long count;
    private long sum;
    ...
    public void accept(int value) {
        ++count;
        sum += value;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }
    ...
}
It is designed to work with collect(); here's the implementation of IntStream.summaryStatistics():
public final IntSummaryStatistics summaryStatistics() {
    return collect(IntSummaryStatistics::new, IntSummaryStatistics::accept,
                   IntSummaryStatistics::combine);
}
The benefit of writing a Collector like this is that your custom aggregation can then run in parallel.
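As a rough sketch of how that idea could be applied to the trade-volume question, the same supplier/accumulator/combiner triple can be packaged with Collector.of. The nested Trade class, the tradeVolumes factory name, and the double[4] layout (24h, 7d, 30d, all time) are assumptions made for illustration; only Collector.of itself is the real JDK API.

```java
import java.util.List;
import java.util.stream.Collector;

class TradeVolumeCollector {
    static final int DAY = 24 * 60 * 60;

    // Hypothetical stand-in for the question's Trade class.
    static final class Trade {
        private final long timestamp;
        private final double volume;

        Trade(long timestamp, double volume) {
            this.timestamp = timestamp;
            this.volume = volume;
        }

        long getTimestamp() { return timestamp; }
        double getVolume() { return volume; }
    }

    // Builds a collector whose result is {volume24h, volume7d, volume30d, volumeAll}.
    static Collector<Trade, double[], double[]> tradeVolumes(int timeStamp) {
        return Collector.of(
                // supplier: one fresh accumulator per (sub)stream
                () -> new double[4],
                // accumulator: fold a single trade into the running totals
                (acc, trade) -> {
                    long tradeTime = trade.getTimestamp();
                    double v = trade.getVolume();
                    acc[3] += v;
                    if (tradeTime + 30 * DAY > timeStamp) acc[2] += v;
                    if (tradeTime + 7 * DAY > timeStamp) acc[1] += v;
                    if (tradeTime + DAY > timeStamp) acc[0] += v;
                },
                // combiner: merge two partial results (used by parallel streams)
                (left, right) -> {
                    for (int i = 0; i < left.length; i++) {
                        left[i] += right[i];
                    }
                    return left;
                });
    }
}
```

With this in place, a call such as trades.stream().collect(TradeVolumeCollector.tradeVolumes(timeStamp)) makes a single pass, and switching to parallelStream() should only require that the combiner merges partial arrays correctly.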
Upvotes: 9
Reputation: 31648
It might be possible to use Collectors.groupingBy to partition the data; however, the grouping expression would be complicated and not intention-revealing.
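To make that trade-off concrete, here is one way a groupingBy version might look. It is only a sketch: the nested Trade class and the bucket helper are hypothetical, and the approach assumes each trade is assigned to the narrowest window it falls in, after which the per-bucket sums are accumulated into running totals.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TradeVolumesGrouping {
    static final int DAY = 24 * 60 * 60;

    // Hypothetical stand-in for the question's Trade class.
    static final class Trade {
        private final long timestamp;
        private final double volume;

        Trade(long timestamp, double volume) {
            this.timestamp = timestamp;
            this.volume = volume;
        }

        long getTimestamp() { return timestamp; }
        double getVolume() { return volume; }
    }

    // Hypothetical helper: 0 = last 24h, 1 = last 7d, 2 = last 30d, 3 = older.
    static int bucket(Trade trade, int timeStamp) {
        long tradeTime = trade.getTimestamp();
        if (tradeTime + DAY > timeStamp) return 0;
        if (tradeTime + 7 * DAY > timeStamp) return 1;
        if (tradeTime + 30 * DAY > timeStamp) return 2;
        return 3;
    }

    static double[] getTradeVolumes(List<Trade> trades, int timeStamp) {
        // Single pass: sum the volume of each disjoint bucket.
        Map<Integer, Double> perBucket = trades.stream()
                .collect(Collectors.groupingBy(
                        trade -> bucket(trade, timeStamp),
                        Collectors.summingDouble(Trade::getVolume)));

        // The windows are nested, so each result is a running total of the buckets.
        double[] volumes = new double[4];
        double running = 0;
        for (int i = 0; i < volumes.length; i++) {
            running += perBucket.getOrDefault(i, 0.0);
            volumes[i] = running;
        }
        return volumes;
    }
}
```

The extra bucketing step and the cumulative sum are what make this version harder to read than a direct accumulation.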
Since getTimestamp() is an expensive operation, it is probably best to keep it as a pre-Java 8 iteration so you only have to calculate the value once per Trade.
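A minimal sketch of such a loop, assuming a Trade class with getTimestamp() and getVolume() as in the question (the nested Trade class below is a hypothetical stand-in), might look like this:

```java
import java.util.List;

class TradeVolumesLoop {
    static final int DAY = 24 * 60 * 60;

    // Hypothetical stand-in for the question's Trade class.
    static final class Trade {
        private final long timestamp;
        private final double volume;

        Trade(long timestamp, double volume) {
            this.timestamp = timestamp;
            this.volume = volume;
        }

        long getTimestamp() { return timestamp; }
        double getVolume() { return volume; }
    }

    static double[] getTradeVolumes(List<Trade> trades, int timeStamp) {
        double volume24h = 0, volume7d = 0, volume30d = 0, volume = 0;
        for (Trade trade : trades) {
            // Read the timestamp and volume once per trade, then reuse them.
            long tradeTime = trade.getTimestamp();
            double v = trade.getVolume();
            volume += v;
            if (tradeTime + 30 * DAY > timeStamp) volume30d += v;
            if (tradeTime + 7 * DAY > timeStamp) volume7d += v;
            if (tradeTime + DAY > timeStamp) volume24h += v;
        }
        return new double[]{volume24h, volume7d, volume30d, volume};
    }
}
```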
Just because Java 8 adds shiny new tools doesn't mean you should treat them as a hammer for every nail.
Upvotes: 0