numpy - distances between two points from vectors of shape(n, 2)

Question

Background

Suppose I have an ndarray of coordinates of shape (n, 2), where each row is an (x, y) coordinate.

import numpy as np

shape = (5, 2)  # n = 5 points, matching the sample output below
X = np.random.random(shape) * 10  # random (x, y) coordinates in [0, 10)
---
[[9.47743968 8.60682597]
 [7.35620992 6.87031756]
 [5.05200433 3.62373581]
 [4.33732145 3.72994235]
 [4.34982473 4.46453609]]
...

The distance between two vectors Xi and Xj in X is |Xj - Xi|. The distances for all pairs in X could be computed as shown in the image.
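For a single pair, the definition can be sketched in numpy as follows, assuming |Xj - Xi| means the Euclidean norm. The two points are the first two rows of the sample X shown above, and the variable names are only illustrative.

```python
import numpy as np

# First two rows of the sample X shown above.
Xi = np.array([9.47743968, 8.60682597])
Xj = np.array([7.35620992, 6.87031756])

# |Xj - Xi| as the Euclidean (L2) norm of the difference vector.
d = np.linalg.norm(Xj - Xi)
print(d)
```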

Question

Is it possible to do this with numpy only, without scipy (e.g. scipy.spatial.distance.pdist(X, metric='euclidean', *args, **kwargs))? Please help me understand which numpy functions are available and how to achieve this.

Research

SciPy

I thought I could look into the scipy code, as scipy.spatial.distance.pdist(X, metric='euclidean', *args, **kwargs) seems to compute the distances between vectors.

Pairwise distances between observations in n-dimensional space.

import scipy.spatial
scipy.spatial.distance.pdist(X)
---
array([2.74136411, 6.6645079 , 7.08553522, 6.59173729, 3.9811627 ,
       4.35610423, 3.85047223, 0.72253128, 1.09544571, 0.73470014])

Lines 2050-2059 of scipy's distance.py appear to be the code that calls the distance function for the chosen metric (e.g. euclidean). However, that call goes into C code in distance_wrap.c, so it is not implemented in numpy.

static PyObject *pdist_seuclidean_double_wrap(PyObject *self, PyObject *args, 
                                              PyObject *kwargs) 
{
  PyArrayObject *X_, *dm_, *var_;
  int m, n;
  double *dm;
  const double *X, *var;
  static char *kwlist[] = {"X", "dm", "V", NULL};
  if (!PyArg_ParseTupleAndKeywords(args, kwargs, 
            "O!O!O!:pdist_seuclidean_double_wrap", kwlist,
            &PyArray_Type, &X_,
            &PyArray_Type, &dm_,
            &PyArray_Type, &var_)) {
    return 0;
  }
  else {
    NPY_BEGIN_ALLOW_THREADS;
    X = (double*)X_->data;
    dm = (double*)dm_->data;
    var = (double*)var_->data;
    m = X_->dimensions[0];
    n = X_->dimensions[1];

    pdist_seuclidean(X, var, dm, m, n);
    NPY_END_ALLOW_THREADS;
  }
  return Py_BuildValue("d", 0.0);
}

kkcmz · Accepted Answer

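import numpy as np

shape = (10, 2)
X = np.random.random(shape)
# X[:, 0] --> x values
# X[:, 1] --> y values
# Broadcast (n, 1, 2) against (1, n, 2) to get all pairwise differences,
# then take the Euclidean norm over the last axis: dist[i, j] = |X[i] - X[j]|.
dist = np.sqrt(np.sum((X[:, np.newaxis, :] - X[np.newaxis, :, :]) ** 2, axis=-1))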
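To see how the broadcasting expression relates to pdist, here is a minimal sketch that applies it to the five sample points printed in the question and then extracts the pairs with i < j. The use of np.triu_indices to build the condensed form is an addition for illustration and is not part of the original answer. The printed values should match the pdist output quoted in the question up to rounding.

```python
import numpy as np

# The five sample points printed in the question.
X = np.array([
    [9.47743968, 8.60682597],
    [7.35620992, 6.87031756],
    [5.05200433, 3.62373581],
    [4.33732145, 3.72994235],
    [4.34982473, 4.46453609],
])

# (n, 1, 2) - (1, n, 2) broadcasts to (n, n, 2): diff[i, j] = X[i] - X[j].
diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]

# Euclidean norm over the last axis gives the full (n, n) distance matrix.
dist = np.sqrt(np.sum(diff ** 2, axis=-1))

# Keep only the pairs with i < j, in row-major order, to get the condensed form.
i, j = np.triu_indices(len(X), k=1)
condensed = dist[i, j]

print(condensed)
```

The full matrix is symmetric with zeros on the diagonal, so half of the work is redundant, and the intermediate (n, n, 2) array grows quadratically with n. For small n that is usually an acceptable price for a loop-free expression.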
