Nearest Neighbour Algorithm

Question

Okay, so I'm pretty new to programming. I'm using Python 2.7, and my next goal is to implement some light version of the Nearest Neighbour algorithm (note that I'm not talking about the k-nearest neighbour). I've tried many approaches, som of them close, but I still can't seem to nail it.

First of all, - I'm using an array to represent my incidens-matrix, is that a good idea ? I'm thinking of the numpys array and matrix here.

Second, - I know the algorithm (Pseudocode), it is pretty straight forward. But I could definitely use some sort of kickstart. I'm not interested in the complete implementation, but I'm stock for now, and I'm seeking some help.

Maybe the above isn't enaugh to clarify my problem, but feel free to ask.

Thank you in advance.

Okay, so now I have given it one more try. It seems I've got it I think, - anyway here is what I have done. I'm pretty happy with the result, since I'm new to this. But I'm sure you have som hints or improvements.

import numpy as np
import copy

'''
                        NEAREST NEIGHBOUR ALGORITHM
                        ---------------------------


The algorithm takes two arguments. The first one is an array, with elements
being lists/column-vectors from the given complete incidensmatrix. The second 
argument is an integer which represents the startingnode where 1 is the 
smallest. The program will only make sense, if the triangle inequality is satisfied.
Furthermore, diagonal elements needs to be inf. The pseudocode is listed below:


1. - stand on an arbitrary vertex as current vertex.
2. - find out the shortest edge connecting current vertex and an unvisited vertex V.
3. - set current vertex to V.
4. - mark V as visited.
5. - if all the vertices in domain are visited, then terminate.
6. - Go to step 2.

The sequence of the visited vertices is the output of the algorithm

Remark - infinity is entered as np.inf
'''



def NN(A, start):

    start = start-1 #To compensate for the python index starting at 0.
    n = len(A)
    path = [start]
    costList = []
    tmp = copy.deepcopy(start)
    B = copy.deepcopy(A)

    #This block eliminates the startingnode, by setting it equal to inf.
    for h in range(n):
        B[h][start] = np.inf

    for i in range(n):

        # This block appends the visited nodes to the path, and appends
        # the cost of the path.
        for j in range(n):
            if B[tmp][j] == min(B[tmp]):
                costList.append(B[tmp][j])
                path.append(j)
                tmp = j
                break

        # This block sets the current node to inf, so it can't be visited again.
        for k in range(n):
            B[k][tmp] = np.inf

    # The last term adds the weight of the edge connecting the start - and endnote.
    cost = sum([i for i in costList if i < np.inf]) + A[path[len(path)-2]][start]

    # The last element needs to be popped, because it is equal to inf.
    path.pop(n)

    # Because we want to return to start, we append this node as the last element.
    path.insert(n, start)

    # Prints the path with original indicies.
    path = [i+1 for i in path]

    print "The path is: ", path
    print "The cost is: ", cost
    print
    return ""

'''
If the desired result is to know the path and cost from every startnode,
then initialize the following method:
''' 
def every_node(A):
    for i in range(1, len(A)):
        print NN(A, i)
    return ""

Luke · Accepted Answer

Your solution is fine, but it can be simplified and at the same time made more efficient. If you are using numpy arrays, it is often the case that those small, inner for-loops can be replaced with just a few lines of code. The result should be shorter and run much faster because numpy uses compiled functions to do its work. It can take some time to get used to this style of programming--instead of looping over every element of an array, you operate on the entire array at once. It will help to read examples of this process; look for SO questions like "how do I make XX more efficient in numpy?".

Here is an example NN implementation:

import numpy as np
def NN(A, start):
    """Nearest neighbor algorithm.
    A is an NxN array indicating distance between N locations
    start is the index of the starting location
    Returns the path and cost of the found solution
    """
    path = [start]
    cost = 0
    N = A.shape[0]
    mask = np.ones(N, dtype=bool)  # boolean values indicating which 
                                   # locations have not been visited
    mask[start] = False

    for i in range(N-1):
        last = path[-1]
        next_ind = np.argmin(A[last][mask]) # find minimum of remaining locations
        next_loc = np.arange(N)[mask][next_ind] # convert to original location
        path.append(next_loc)
        mask[next_loc] = False
        cost += A[last, next_loc]

    return path, cost

..and here is an example of this function in use:

# Expected order is 0,2,3,1,4
A = np.array([
    [0, 2, 1, 2, 2],
    [2, 0, 2, 1, 1],
    [1, 2, 0, 1, 2],
    [2, 1, 1, 0, 2],
    [2, 1, 2, 2, 0]])
print NN(A,0)

Nearest Neighbour Algorithm

Answers (1)

Related Questions