Riyujin
Riyujin

Reputation: 13

Optimizing a grid of dots with regard to distances (python)

I'm facing an issue on which I'm stuck between time of calculus and constrains...

  1. Easy case :

I have a grid of dots somewhat equally distributed onto a 3D grid (first picture).

NB : you don't see it there but there is a 3rd axis perpendicular to the grid - there are 125 dots (5 x 5 x 5 for each side)

3D grid with dots distributed ~ equally

My goal is, taking a distance given by noise statistics (we will take 0.03 for this case), to fill the grid with those conditions :

Here is an example of a filled grid :

Badly filled grid (still in 3D)

The thing is, here there are a lot of white space which i would like to remove.

For more details, here is a plot of all the distances from one point to its closest neighbour :

Histogram of distances from dots their closest neighbour

The red rectangle is where i would like all the blue bars to stack, between 0.03 - 10% and 0.03 + 10% in this case.

The major problem is that, because it's randomly generated, I have to generate a lot of dots to hope that even only one will fit in the constraint. Here it's a relatively "easy case" because afterwards, I will have to apply this on :

  1. Hard case :

    • Dots not equally distributed into a base of more than 10 axis
    • Number of dots up to 1000 and more

I'm not a crack in python, I only know the basics (numpy, matplotlib.pyplot), and am pretty comfortable with charts, arrays, loops, however my code is not optimized in terms of time of calculus for those kind of tasks as :

If anyone has a simple idea or some advices, it would be nice, I also hope I did explain well (I tried to keep the most important stuff only, and tried to write in proper english also :) ).

Thanks for reading,

Nathan.

EDIT : Simplified code here, as it will probably help answering more easily ! I directly added the non equally distributed grid (A, B).

import numpy as np
import matplotlib.pyplot as plt
plt.close("all")

A = np.random.rand(100)
B = np.random.rand(100)

plt.figure()
plt.plot(A, B,".")
plt.grid()

Nfillers = 10000
SNRD = 0.03

for i in range(Nfillers):
    a, b = np.random.rand(2)
    A = np.hstack((A, np.atleast_1d(a)))
    B = np.hstack((B, np.atleast_1d(b)))
    DCN = 1E+308 # Distance Closest Neighbor
    for j in range(len(A)): # len(A) is updated with each iteration
        D1 = A[j] - a
        D2 = B[j] - a
        distance = np.sqrt(D1**2 + D2**2)
        if distance != 0:
            if distance < DCN:
                DCN = distance
    if DCN < 0.9*SNRD or DCN > 1.1*SNRD:
        A = np.delete(A, len(A)-1)
        B = np.delete(B, len(B)-1)

    print i

plt.figure()
plt.plot(A, B, ".")
plt.grid()

EDIT 2: Quicker code :

import numpy as np
import matplotlib.pyplot as plt

A = np.random.rand(100)
B = np.random.rand(100)

plt.figure()
plt.plot(A, B,".")
plt.grid()

Nfillers = 100000
SNRD = 0.03

for i in range(Nfillers):
    a, b = np.random.rand(2)
    DCN = np.min(np.sqrt((A - a)**2 + (B - b)**2))
    if DCN > 0.9*SNRD and DCN < 1.1*SNRD:
        A = np.hstack((A, np.atleast_1d(a)))
        B = np.hstack((B, np.atleast_1d(b)))

    print(i)

plt.figure()
plt.plot(A, B, ".")
plt.grid()

Upvotes: 0

Views: 1042

Answers (1)

jawknee
jawknee

Reputation: 178

You haven't included any code but here is one rough idea of how to speed up these sorts of problems (computing distance between many points).

Partition the space into blocks roughly the size of the distance required, d. This could be a mapping of the block number to the list of points E.g. blocks[i,j,k] = list_of_points. Then when you generate a new point, you only need to check adjacent blocks. Something like:

new_point = random_point()
i, j, k = get_block_indices(new_point)

for ai in (i-1, i, i+1):
    for aj in (j-1, j, j+1):
        for ak in (k-1, k, k+1):
            for other_point in blocks[ai, aj, ak]:
                # check whether other_point is too close
                found_close_other_point = check(new_point, other_point, d)
                if found_close_other_point:
                    return  # start again

# only reach if we haven't found a close point
try:
    blocks[i, j, k].append(new_point)
except KeyError:
    # create new block
    blocks[i, j, k] = [new_point]

There's a trade-off between how many blocks you want to store and how many neighbours you need to check. Some additional points:

  • You don't want to store every block, instead just handle when the block doesn't exist yet (KeyError) as being empty. In the worst case you will need as many blocks as points if you do it lazily like this.

  • You could use itertools.product to avoid the nested for loop.

  • If you play with the size of blocks you could also introduce extra skipping conditions, e.g. if you know you likely can't fit more than n points in a block, then you just need to check len(block[i, j, k]) > n.

  • A pure python implementation might not be fastest, you could look into speed things up using http://numba.pydata.org/, which is very easy and fast for these sorts of numeric problem whilst still reading like python.


EDIT: an implementation.

import itertools
import numpy as np
from math import ceil
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D


def compute_min_distances(ps):
    """Compute the minimum distances between all points ``ps``,
    with shape (num_points, num_dims).
    """
    ndim = ps.shape[-1]
    p1, p2 = ps.reshape(-1, 1, ndim), ps.reshape(1, -1, ndim)
    dd = np.sum((p1 - p2)**2, -1)**0.5
    # want to ignore distance to self
    np.fill_diagonal(dd, np.inf)
    return np.min(dd, axis=0)


def get_block_indices(p, B):
    """Given B blocks, get the indices of ``p``.
    """
    return [int(c * B) for c in p]


def init_blocks(ps, B):
    """Set up the initial block mapping.
    """
    blocks = {}
    for p in ps:
        i, j, k = get_block_indices(p, B)
        try:
            blocks[i, j, k].append(p)
        except KeyError:
            blocks[i, j, k] = [p]
    return blocks


def distance(p1, p2):
    """Distance bewtween two points.
    """
    return sum((a - b)**2 for a, b in zip(p1, p2))**0.5


def gen_close_points(p0s, D=0.03, N=1000):
    """Takes ``p0s`` and generates ``N`` new points with each of 
    which has a closest neighbour +- 10% of ``D``.
    """
    # number of blocks required so that only neigbours matter
    B = ceil(1 / D)

    # mapping of blocks to points
    blocks = init_blocks(p0s, B)

    n = 0
    while n < N:
        new_p = np.random.rand(3)
        i, j, k = get_block_indices(new_p, B)

        neighbour_inds = [(b - 1, b, b + 1) for b in (i, j, k)]

        def gen_neighbours():
            for ai, aj, ak in itertools.product(*neighbour_inds):
                try:
                    # try yielding each point
                    yield from blocks[ai, aj, ak]
                except KeyError:
                    # skip if empty
                    continue

        ds = (distance(new_p, p) for p in gen_neighbours())
        min_d = min(ds, default=0)

        if 0.9 * D < min_d < 1.1 * D:
            try:
                blocks[i, j, k].append(new_p)
            except KeyError:
                blocks[i, j, k] = [new_p]
            n += 1

    # combine into array of all points
    return np.concatenate(tuple(itertools.chain(blocks.values())))

Let's give it a whirl with some initial points:

p0s = np.random.rand(100, 3)
d0s = compute_min_distances(p0s)
plt.hist(d);

enter image description here

Now let's generate 10000 more points, each with closest distance specified:

ps = gen_close_points(p0s, 0.03, 10000)

And check the distances:

plt.hist(compute_min_distances(ps));

enter image description here

Looks good! Finally lets plot the old points and new points together:

fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(111, projection='3d')
ax.scatter3D(ps[:, 0], ps[:, 1], ps[:, 2], alpha=0.1)
ax.scatter3D(p0s[:, 0], p0s[:, 1], p0s[:, 2])

enter image description here

This currently assumes that the data is 3-dimensional with range (0, 1) but should be a decent starting point for generalizing it.

Upvotes: 1

Related Questions