moe asal
moe asal

Reputation: 582

Linear time algorithm for computing radius of membership hyper-sphere

We are given a Graph, G(V, E), where V is the node set and E is the edge set consisting of ordered tuples (u, v). The graph is undirected, as such, if (u,v) is in E, then (v, u) is in E.

Alongside the graph, you are given an embedding of the graph in k-d space. Let the embedding function be P: V -> R^k.

For each node v in the graph, I want to compute its corresponding r such that: for all u, if Dist(P(v), P(u)) < r then (v, u) in E

In other terms, for every node v in the graph, I want to define a hyper-sphere with center v and radius r, such that all the points inside the sphere correspond to neighbors of v. Points lying outside the sphere can be a neighbor of v, but I want to maximize r until it reaches a roadblock or covers all neighbors. r should be the minimum of {distance between v and its farthest neighbor, distance between v and its closest non-neighbor}.

One trivial way to do it is to compute the distance matrix and proceed from there. But computing that matrix requires O(n^2) time. Is it possible to do this in linear time or almost-linear time?

Upvotes: 3

Views: 108

Answers (2)

btilly
btilly

Reputation: 46497

I'm sure that this is impossible. Just scatter n vertices around randomly. Connect each vertex to the nearest half of vertices. Now remove one edge. I'm pretty sure that there is no way to find the two vertices whose r was just reduced without a brute force search of the edges, which takes O(k n^2).

If k is large, you additionally run into the Curse of Dimensionality. Any kind of smart search by distance now becomes prohibitive.

But in few dimensions, with few edges and small radii, you can improve on brute force. Just throw your vertices into a k-d tree and then do an expanding nearest neighbors search until you find the last edge, or the first non-edge. The efficiency of this search will depend on k. And again, its speedup can only work if relatively few vertices will be within distance r.

Upvotes: 2

user24714692
user24714692

Reputation: 4949

"Is it possible to do this in linear time or almost-linear time?"

  • I don't believe O(N) time complexity would be an option for solving this multi-dimensional graph problem.

  • O(k * N ^ 2) time complexity is possible. In the case of having infinite dimensions, that would even lead to an O(N ^ 3) time complexity.


Example

import numpy as np
from collections import defaultdict


def get_min_max(nodes, edges, vects):
    n, res = len(nodes), defaultdict(int)
    SE = set(edges)
    SE.update((v, u) for u, v in edges)

    D = np.linalg.norm(vects[:, np.newaxis] - vects, axis=2)

    for i, v in enumerate(nodes):
        dists = D[i]
        neis, others = set(), set()
        for j, u in enumerate(nodes):
            if u == v:
                continue
            if (v, u) in SE:
                neis.add(dists[j])
            else:
                others.add(dists[j])

        far = max(neis) if neis else float('inf')
        near = min(others) if others else float('inf')

        res[v] = (max(far, near), min(far, near))

    return res


nodes = ('a', 'b', 'c', 'd', 'e', 'f', 'g')
edges = (('a', 'b'), ('a', 'e'), ('a', 'g'), ('b', 'c'), ('c', 'd'), ('c', 'g'), ('c', 'e'), ('d', 'e'),
         ('d', 'f'), ('f', 'd'), ('f', 'e'), ('f', 'c'), ('d', 'g'), ('g', 'b'), ('g', 'd'))
vects = np.array((
    (1, 0, 1),
    (1, 0, 1),
    (0, 1, 1),
    (0, 0, 1),
    (1, 1, 1),
    (1, 0, 1),
    (0, 1, 0),
))

print(get_min_max(nodes, edges, vects))

Prints

defaultdict(<class 'int'>, {'a': (1.7320508075688772, 0.0), 'b': (1.7320508075688772, 0.0), 'c': (1.4142135623730951, 1.4142135623730951), 'd': (1.4142135623730951, 1.0), 'e': (1.4142135623730951, 1.0), 'f': (1.4142135623730951, 0.0), 'g': (1.7320508075688772, 1.4142135623730951)})


Upvotes: 2

Related Questions