vikram-s-narayan

Reputation: 578

Row-wise operations between matrices in Julia

I'm trying to translate the following Python code (from SMT's GEKPLS) into Julia:

def differences(X, Y):
    D = X[:, np.newaxis, :] - Y[np.newaxis, :, :]
    return D.reshape((-1, X.shape[1]))

So, given an input like this:

X = np.array([[1.0,1.0,1.0], [2.0,2.0,2.0]])
Y = np.array([[1.0,2.0,3.0], [4.0,5.0,6.0], [7.0,8.0,9.0]])
diff = differences(X,Y)

We get an output (diff) that looks like this:

[[ 0. -1. -2.]
 [-3. -4. -5.]
 [-6. -7. -8.]
 [ 1.  0. -1.]
 [-2. -3. -4.]
 [-5. -6. -7.]]

What is an efficient way to do this with Julia code? I expect the X and Y input matrices to be quite large.

Upvotes: 2

Views: 749

Answers (3)

AboAmmar

Reputation: 5559

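This might be faster than the other alternatives while still being easy to understand. It assumes X and Y are collections of rows (for example, a Vector of Vectors); iterating a Matrix directly visits its individual elements, so for matrices iterate over eachrow(X) and eachrow(Y) instead.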

[x .- y for x ∈ X for y ∈ Y]

6-element Vector{Vector{Float64}}:
 [0.0, -1.0, -2.0]
 [-3.0, -4.0, -5.0]
 [-6.0, -7.0, -8.0]
 [1.0, 0.0, -1.0]
 [-2.0, -3.0, -4.0]
 [-5.0, -6.0, -7.0]
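
For Matrix inputs like the ones in the question, a minimal sketch of the same comprehension could look like the following. The name rowdiffs is made up for this illustration, and the final stacking step is just one possible way to get back to a Matrix.

```julia
X = [1.0 1.0 1.0; 2.0 2.0 2.0]
Y = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]

# eachrow yields one view per row; iterating the Matrix itself would yield scalars.
# rowdiffs is a hypothetical helper name, not part of the answer above.
rowdiffs(X, Y) = [x .- y for x in eachrow(X) for y in eachrow(Y)]

D = rowdiffs(X, Y)                 # 6-element Vector{Vector{Float64}}

# Stack the row differences into a 6×3 Matrix if downstream code expects one.
M = permutedims(reduce(hcat, D))
```

The order matches the NumPy version: all rows of Y are paired with the first row of X before moving on to the second.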

The one thing I disliked about NumPy is that you have to remember exactly which function to use with which combination of input parameters. In Julia, a plain loop can serve as an efficient drop-in replacement for most algorithms.

Addendum: As I said, the above might be the fastest solution, provided that working with a Vector{Vector{Float64}} is not an issue. If it is, here is another solution that returns a Matrix{Float64} and is fast as well.

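function diffr(X, Y)
    # X and Y are collections of rows (a Vector of Vectors, or eachrow of a Matrix)
    i, l, m, n = 0, length(first(X)), length(X), length(Y)
    Z = Matrix{Float64}(undef, m*n, l)   # one output row per (x, y) pair
    for x in X, y in Y
        Z[i+=1,:] .= x .- y              # write the difference into the next row
    end
    return Z
end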
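
If the inputs are plain matrices, the same preallocate-and-fill idea can be written with explicit indices. This is only a sketch under that assumption: diffr_matrix is a hypothetical name, and the loop order is chosen so that writes to Z follow Julia's column-major storage.

```julia
# Hypothetical matrix-input variant of the preallocating loop.
function diffr_matrix(X::AbstractMatrix, Y::AbstractMatrix)
    size(X, 2) == size(Y, 2) ||
        throw(DimensionMismatch("X and Y need the same number of columns"))
    m, n, l = size(X, 1), size(Y, 1), size(X, 2)
    Z = Matrix{promote_type(eltype(X), eltype(Y))}(undef, m * n, l)
    # k (column) is the outermost loop and j the innermost, so consecutive
    # writes land in consecutive rows of the same column of Z.
    for k in 1:l, i in 1:m, j in 1:n
        Z[(i - 1) * n + j, k] = X[i, k] - Y[j, k]
    end
    return Z
end

X = [1.0 1.0 1.0; 2.0 2.0 2.0]
Y = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]
Z = diffr_matrix(X, Y)   # 6×3 Matrix{Float64}
```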

And here is a performance comparison of all posted solutions on my computer.

using BenchmarkTools

@btime [x.-y for x∈$X for y∈$Y] # 312.245 ns (9  allocations: 656 bytes)
@btime diffr($X, $Y)            #  73.868 ns (1  allocation:  208 bytes)
@btime differences($X, $Y)      # 439.000 ns (12 allocations: 896 bytes)
@btime diffs_row($X, $Y)        # 463.131 ns (11 allocations: 784 bytes)

Upvotes: 1

DNF

Reputation: 12654

Here's a version that avoids repeat, which creates unnecessary data duplication:

function diffs_row(X, Y)
    N = size(X, 2)
    return reshape(reshape(X', N, 1, :) .- Y', N, :)'
end

The reason for all the adjoints (') is that operating row-wise isn't really natural in Julia. Julia arrays are column-major, so reshape reads the data column by column. If you decide instead to change the orientation of the data, you could write
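
To make the column-major point concrete, here is a small illustration on a made-up 2×3 matrix A (not part of the original answer). It shows what reshape and vec return with and without transposing first.

```julia
A = [1 2 3;
     4 5 6]

# vec and reshape walk the data column by column.
v  = vec(A)                 # [1, 4, 2, 5, 3, 6]
R  = reshape(A, 3, 2)       # [1 5; 4 3; 2 6]

# Transposing first puts each row of A into a column, so rows come out in order.
vt = vec(permutedims(A))    # [1, 2, 3, 4, 5, 6]
```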

function diffs_col(X, Y)
    N = size(X, 1)
    return reshape(reshape(X, N, 1, :) .- Y, N, :)
end

instead.

One often sees this when translating NumPy code to Julia. NumPy is row-major by default, so the translation becomes a bit awkward. In many cases it is worth changing your data layout to column-major, with one point per column.
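
As a rough sketch of what such a layout change might look like, the question's data can be converted once with permutedims and then kept column-oriented. The names Xc, Yc and Dc are made up here, and diffs_col is repeated in one-line form only so the snippet runs on its own.

```julia
# Row-oriented data as in the question: one point per row.
X = [1.0 1.0 1.0; 2.0 2.0 2.0]
Y = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]

# One-time conversion to column-oriented storage: one point per column.
Xc = permutedims(X)   # 3×2
Yc = permutedims(Y)   # 3×3

# Same broadcast as diffs_col, restated so this block is self-contained.
diffs_col(X, Y) = reshape(reshape(X, size(X, 1), 1, :) .- Y, size(X, 1), :)

Dc = diffs_col(Xc, Yc)   # 3×6, one difference vector per column
```

Each column of Dc corresponds to one row of the NumPy result, in the same order, so permutedims(Dc) recovers the row-oriented output if it is needed at the boundary.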

Upvotes: 1

Elias Carvalho

Reputation: 296

After some thinking, I came to this function:

function differences(X, Y)
    Rx = repeat(X, inner=(size(Y, 1), 1))
    Ry = repeat(Y, size(X, 1))
    Rx - Ry
end
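
To see why the two repeat calls line up, here is a small illustration on made-up inputs (the values are arbitrary and chosen so the rows are easy to tell apart). Note that both intermediates are full-size copies, which is the cost of this approach on large inputs.

```julia
X = [1.0 1.0; 2.0 2.0]                   # 2 rows
Y = [10.0 20.0; 30.0 40.0; 50.0 60.0]    # 3 rows

# inner repeats each row in place: x1, x1, x1, x2, x2, x2
Rx = repeat(X, inner=(size(Y, 1), 1))

# a positional count tiles the whole matrix along dimension 1: Y stacked on Y
Ry = repeat(Y, size(X, 1))

# Both are 6×2, so the subtraction pairs every row of X with every row of Y.
D = Rx - Ry
```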

I hope I was helpful.

Upvotes: 1
