Reputation: 578
I'm trying to translate the following Python code (from SMT's GEKPLS) into Julia:

    import numpy as np

    def differences(X, Y):
        # Broadcast to shape (len(X), len(Y), ncols), then flatten the pairs into rows.
        D = X[:, np.newaxis, :] - Y[np.newaxis, :, :]
        return D.reshape((-1, X.shape[1]))

So, given an input like this:

    X = np.array([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
    Y = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
    diff = differences(X, Y)

We get an output (diff) that looks like this:

    [[ 0. -1. -2.]
     [-3. -4. -5.]
     [-6. -7. -8.]
     [ 1.  0. -1.]
     [-2. -3. -4.]
     [-5. -6. -7.]]

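For reference, the row ordering produced by the broadcast can be spelled out with plain nested loops. This is only a sketch in pure Python (no NumPy), and the name `differences_loops` is made up for illustration: every row of X is paired with every row of Y, with the X index varying slowest.

```python
def differences_loops(X, Y):
    """Pairwise row differences X[i] - Y[j], with i varying slowest."""
    out = []
    for x_row in X:          # outer loop: rows of X
        for y_row in Y:      # inner loop: rows of Y
            out.append([a - b for a, b in zip(x_row, y_row)])
    return out

X = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
Y = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
diff = differences_loops(X, Y)  # 6 rows: len(X) * len(Y)
```

Any translation should reproduce this ordering: all pairs for the first row of X, then all pairs for the second row, and so on.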
What is an efficient way to do this with Julia code? I expect the X and Y input matrices to be quite large.
Upvotes: 2
Views: 749
Reputation: 5559
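This might be faster than the other alternatives, while still being easy to understand. Note that it assumes X and Y are stored as vectors of rows (Vector{Vector{Float64}}) rather than as matrices, since iterating over a Matrix yields individual elements; for matrices, iterate over eachrow(X) and eachrow(Y) instead.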

    [x .- y for x ∈ X for y ∈ Y]

    6-element Vector{Vector{Float64}}:
     [0.0, -1.0, -2.0]
     [-3.0, -4.0, -5.0]
     [-6.0, -7.0, -8.0]
     [1.0, 0.0, -1.0]
     [-2.0, -3.0, -4.0]
     [-5.0, -6.0, -7.0]

The one thing I dislike about numpy is that one has to remember exactly how each function behaves for each combination of input parameters. In Julia, a plain loop can serve as an efficient drop-in replacement for most algorithms.
Addendum: The above might be the fastest solution, as I said, provided that working with a Vector{Vector{Float64}} is not an issue. If it is, here is another solution that outputs a Matrix{Float64} while being fast as well.
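
    function diffr(X, Y)
        # X and Y are vectors of equal-length rows, e.g. Vector{Vector{Float64}}
        i, l, m, n = 0, length(first(X)), length(X), length(Y)
        Z = Matrix{Float64}(undef, m*n, l)  # one output row per (x, y) pair
        for x in X, y in Y
            Z[i+=1, :] .= x .- y  # advance the row counter, then fill that row in place
        end
        Z
    end
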
And here is a performance comparison of all posted solutions on my computer.

    using BenchmarkTools

    @btime [x.-y for x∈$X for y∈$Y]  # 312.245 ns (9 allocations: 656 bytes)
    @btime diffr($X, $Y)             # 73.868 ns (1 allocation: 208 bytes)
    @btime differences($X, $Y)       # 439.000 ns (12 allocations: 896 bytes)
    @btime diffs_row($X, $Y)         # 463.131 ns (11 allocations: 784 bytes)

Upvotes: 1
Reputation: 12654
Here's a version that avoids repeat, which creates unnecessary data duplication:
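
    function diffs_row(X, Y)
        N = size(X, 2)  # number of columns, i.e. the length of each row
        # X' becomes N×1×nx so that it broadcasts against the N×ny matrix Y'
        return reshape(reshape(X', N, 1, :) .- Y', N, :)'
    end
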
The reason for all the adjoints (') is that it isn't really natural to operate row-wise in Julia. Julia arrays are column-major, so reshape traverses the data column by column. If you decide instead to change the orientation of the data, so that each point is stored as a column, you could write

    function diffs_col(X, Y)
        N = size(X, 1)  # number of rows, i.e. the length of each column
        return reshape(reshape(X, N, 1, :) .- Y, N, :)
    end

instead.
One often sees this when translating numpy code to Julia. NumPy arrays are row-major by default, so the translation becomes a bit awkward. In many cases it is worth changing your data layout to column-major.
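To make the layout point concrete, here is a small pure-Python sketch of where the elements of one row end up in memory under each convention. The helper names and the 6×3 shape are assumptions chosen for illustration, and offsets are 0-based.

```python
def row_major_offset(i, j, nrows, ncols):
    # Row-major (NumPy default): elements of a row are adjacent in memory.
    return i * ncols + j

def col_major_offset(i, j, nrows, ncols):
    # Column-major (Julia): elements of a column are adjacent in memory.
    return i + j * nrows

nrows, ncols = 6, 3
# Memory offsets of the three elements of row 0 under each layout.
row0_row_major = [row_major_offset(0, j, nrows, ncols) for j in range(ncols)]
row0_col_major = [col_major_offset(0, j, nrows, ncols) for j in range(ncols)]
```

Under row-major storage the row occupies offsets 0, 1, 2, while under column-major storage the same row is spread over offsets 0, 6, 12. That stride is why row-wise code tends to be less natural in Julia than code that works on columns.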
Upvotes: 1
Reputation: 296
After some thinking, I came to this function:

    function differences(X, Y)
        Rx = repeat(X, inner=(size(Y, 1), 1))  # each row of X repeated once per row of Y
        Ry = repeat(Y, size(X, 1))             # all of Y stacked once per row of X
        Rx - Ry
    end

I hope I was helpful.
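As a rough illustration of what the two repeat calls build, here is a pure-Python sketch. The helpers `repeat_inner` and `repeat_outer` are hypothetical stand-ins for Julia's `repeat(X, inner=(k, 1))` and `repeat(Y, k)`, not real library functions.

```python
def repeat_inner(rows, k):
    """Repeat each row k times in a row: a, a, ..., b, b, ..."""
    return [row for row in rows for _ in range(k)]

def repeat_outer(rows, k):
    """Tile the whole block k times: a, b, ..., a, b, ..."""
    return [row for _ in range(k) for row in rows]

X = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
Y = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

Rx = repeat_inner(X, len(Y))  # 6 rows: each X row repeated 3 times
Ry = repeat_outer(Y, len(X))  # 6 rows: Y stacked twice
diff = [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(Rx, Ry)]
```

Both intermediates have len(X) * len(Y) rows, so this approach holds two full-size temporaries in addition to the result.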
Upvotes: 1