Aggregating multiple values at once

Question

I'm running into a speed issue with a dataset that needs to be aggregated multiple times.

Initially, my team set up three accumulators and ran a single foreach loop over the data, something along the lines of:

val accum1: Accumulable[a]
val accum2: Accumulable[b]
val accum3: Accumulable[c]

data.foreach { u =>
  accum1 += u
  accum2 += u
  accum3 += u
}

I am trying to turn these accumulations into an aggregation so that I get a speed boost and keep accumulators available for debugging. I am looking for a way to aggregate all three types in a single pass, since running three separate aggregations is significantly slower. Does anyone have thoughts on how to do this? Perhaps aggregating agnostically and then pattern matching to split the result into two RDDs?

Thank you
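
One way the single-pass idea from the question could be sketched is to carry all three results in one tuple and update every component per record. The sketch below is only an illustration under assumptions: the record type (Int) and the three results (sum, count, max) are hypothetical stand-ins for the question's a, b and c, and the names Acc, zero, seqOp, combOp and aggregateLocally are made up here. It runs without Spark by folding a few in-memory "partitions"; the closing comment shows how the same two functions would be passed to RDD.aggregate.

```scala
object MultiAggregateSketch {
  // Hypothetical stand-ins for the three accumulated results (the question's a, b, c).
  type Acc = (Long, Long, Int) // (sum, count, max)

  val zero: Acc = (0L, 0L, Int.MinValue)

  // Fold one record into the running tuple (this is what runs inside a partition).
  def seqOp(acc: Acc, u: Int): Acc =
    (acc._1 + u, acc._2 + 1, math.max(acc._3, u))

  // Merge the partial tuples produced by two partitions.
  def combOp(x: Acc, y: Acc): Acc =
    (x._1 + y._1, x._2 + y._2, math.max(x._3, y._3))

  // Local imitation of aggregate: fold each chunk, then merge the partial results.
  def aggregateLocally(partitions: Seq[Seq[Int]]): Acc =
    partitions.map(_.foldLeft(zero)(seqOp)).foldLeft(zero)(combOp)

  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq(3, 1, 4), Seq(1, 5), Seq(9, 2, 6))
    val (sum, count, max) = aggregateLocally(partitions)
    println(s"sum=$sum count=$count max=$max")
    // With a Spark RDD[Int] named data, the same functions would be passed as:
    //   val (sum, count, max) = data.aggregate(zero)(seqOp, combOp)
  }
}
```

Because all three components are updated in the same seqOp call, the data is traversed once rather than three times. If the three results have unrelated types, the tuple's component types can simply differ, so no pattern matching on a common supertype is needed.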
