Donald Seinen
Donald Seinen

Reputation: 4419

Fast mapreduce using multiple ops

For collections x and y I want to call a function f, and reduce the results using operators from a collection red.

x = (1, 2, 3, 4, 5)
y = (2, 4.0, 6, 8.0, 10)
red = (/, *, +, -)

f(x, y::Int) = float(x + y)
f(x, y::AbstractFloat) = x - y

That is, I want the result of

f(1, 2) / f(2, 4.0) * f(3, 6) + f(4, 8.0) - f(5, 10)   # -32.5

The following works but allocates and is ~100x slower than the naive code above.

function mapop(f, ops, itrs...)
    i::Int = 0
    op(x, y) = (i += 1; ops[i](x, y))
    mapreduce(f, op, itrs...)
end

mapop(f, red, x, y)
#> -32.5

@code_warntype mapop(f, red, x, y) shows type instability. mapreduce seems to allocate regardless, @btime mapreduce($*, $+, $1:3, $5:7).

Question
Is there a fast, preferably non-allocating (alternative / way) to mapreduce using multiple functions?

The following loop alternative avoids 30 allocations, and runs faster than mapop, however, I fail in getting rid of all allocations.

function mapop2(x, y, f, ops)
    val = ntuple(i -> Float64(f(x[i], y[i])), length(x))
    res = first(val)
    for j in eachindex(ops)
        res = ops[j](res, val[j+1])
    end
    res
end

Upvotes: 1

Views: 60

Answers (3)

Donald Seinen
Donald Seinen

Reputation: 4419

For posterity,

  • ::Function is abstract, passing a tuple might be slow or hard to reason with about,
  • mixing Int and AbstractFloat is a code smell

I ended up using a tuple red = (4, 3, 1, 2) to represent (/, *, +, -) and a loop instead.

red = (4, 3, 1, 2)
function p(x, y, f, ops)
    res = float(f(x[1], y[1]))
    its = ntuple(i -> float(f(x[i + 1], y[i + 1])), length(ops))
    for i in eachindex(its)
        if ops[i] == 1
            res += its[i]
        elseif ops[i] == 2
            res -= its[i]
        elseif ops[i] == 3
            res *= its[i]
        elseif ops[i] == 4
            res /= its[i]
        end
    end
    res
end

@btime p($x, $y, $f, $red)
#> 7.200 ns (0 allocations: 0 bytes)

Upvotes: 0

Dan Getz
Dan Getz

Reputation: 18217

Another method could be:

x = (1, 2, 3, 4, 5)
y = (2, 4.0, 6, 8.0, 10)
ered = (max, /, *, +, -)

f(x, y::Int) = float(x + y)
f(x, y::AbstractFloat) = x - y

foldeval(f, r, x, y) = 
  foldl((r,(x,y,op))->op(r,f(x,y)),zip(x,y,red); init=-Inf)

giving:

julia> foldeval(f, ered, x, y)
-32.5

julia> @btime foldeval($f, $ered, $x, $y)
  646.577 ns (15 allocations: 432 bytes)
-32.5

(note the extra operation making red into ered)

Upvotes: 2

Ashlin Harris
Ashlin Harris

Reputation: 422

Passing i as an argument to op fixes the warning.

Additionally, some of the slowdown might come from using the splat operator ..., which makes the form of mapop uncertain. On my machine, the following runs a bit more quickly than the original:

function mapop(f, ops, itr1, itr2)
           i::Int = 0
           op(x, y) = (i += 1; ops[i](x, y))
           mapreduce(f, op, itr1, itr2)
       end

Upvotes: 1

Related Questions