sciencectn
sciencectn

Reputation: 1523

Julia: nested loops for pairwise distances is really slow

I have some code which loads a csv file of 2000 2D coordinates, then a function called collision_count counts the number of pairs of coordinates that are closer than a distance d of each other:

using BenchmarkTools
using CSV
using LinearAlgebra

function load_csv()::Array{Float64,2}
    df = CSV.read("pos.csv", header=0)
    return Matrix(df)'
end

function collision_count(pos::Array{Float64,2}, d::Float64)::Int64
    count::Int64 = 0
    N::Int64 = size(pos, 2)
    for i in 1:N
        for j in (i+1):N
            @views dist = norm(pos[:,i] - pos[:,j])
            count += dist < d
        end
    end
    return count
end

Here are the results:

pos = load_csv()

@benchmark collision_count($pos, 2.0)
BenchmarkTools.Trial: 
  memory estimate:  366.03 MiB
  allocs estimate:  5997000
  --------------
  minimum time:     152.070 ms (18.80% GC)
  median time:      158.915 ms (20.60% GC)
  mean time:        158.751 ms (20.61% GC)
  maximum time:     181.726 ms (21.98% GC)
  --------------
  samples:          32
  evals/sample:     1

This is about 30x slower than this Python code:

import numpy as np
import scipy.spatial.distance

pos = np.loadtxt('pos.csv',delimiter=',')

def collision_count(pos, d):
    pdist = scipy.spatial.distance.pdist(pos)
    return np.count_nonzero(pdist < d)

%timeit collision_count(pos, 2)

5.41 ms ± 63 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Any way to make it faster? And what's up with all the allocations?

Upvotes: 4

Views: 346

Answers (2)

Oscar Smith
Oscar Smith

Reputation: 6378

The fastest I can get trivially is the follows

using Distances
using StaticArrays
using LinearAlgebra

pos = [@SVector rand(2) for _ in 1:2000]
function collision_count(pos::Vector{<:AbstractVector}, d)
    count = 0
    @inbounds for i in axes(pos,2)
        for j in (i+1):lastindex(pos,2)
            dist = sqeuclidean(pos[i], pos[j])
            count += dist < d*d
        end
    end
    return count
end

There are a variety of changes here, some stylistic, some structural. Starting with style, you may note that I don't type anything more restrictively than I need to. This has no performance benefit, since Julia is smart enough to infer types for your code.

The biggest structural change is switching from using a matrix to a vector of StaticVectors. The reason for this change is that since points are your scalar type, it makes more sense to have a vector of elements where each element is a point. The next change I made is to use a squared norm, since sqrt operations are expensive. The results speak for themselves:

@benchmark collision_count($pos, .1)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.182 ms (0.00% GC)
  median time:      1.214 ms (0.00% GC)
  mean time:        1.218 ms (0.00% GC)
  maximum time:     2.160 ms (0.00% GC)
  --------------
  samples:          4101
  evals/sample:     1

Note that there are n log(n) algorithms that may be faster, but this should be pretty close to optimal for a naive implementation.

Upvotes: 5

DNF
DNF

Reputation: 12644

Here's a solution that doesn't rely on specific knowledge about the dimensionality of the points:

(Edit: I updated the function to make it more robust with respect to indexing. Some AbstractArrays have indices that do not start at 1, so now I use axes and lastindex instead of size.)

function collision_count2(pos::AbstractMatrix, d)
    count = 0
    @inbounds for i in axes(pos, 2)
        for j in (i+1):lastindex(pos, 2)
            dist2 = sum(abs2(pos[k, i] - pos[k, j]) for k in axes(pos, 1))
            count += dist2 < d^2
        end
    end
    return count
end

Benchmarks:

julia> using BenchmarkTools

julia> @btime collision_count(pos, 0.7) setup=(pos=rand(2, 2000));
  533.055 ms (13991005 allocations: 488.01 MiB)

julia> @btime collision_count2(pos, 0.7) setup=(pos=rand(2, 2000));
  4.700 ms (0 allocations: 0 bytes)

The speed is actually close to the SVector solution. On the upcoming Julia version 1.5, the difference compared to the OP's code should be much smaller, since views become more efficient.

BTW: drop the type annotations, like these

count::Int64 = 0
N::Int64 = size(pos, 2)

it's just adding visual noise.

Upvotes: 1

Related Questions