Reputation: 11431
I have been reading Algorithm design manual. I have the same question as What is the meaning of "from distinct vertex chains" in this nearest neighbor algorithm? but I am not able to follow the answers there.
A different idea might be to repeatedly connect the closest pair of endpoints whose connection will not create a problem, such as premature termination of the cycle. Each vertex begins as its own single vertex chain. After merging everything together, we will end up with a single chain containing all the points in it. Connecting the final two endpoints gives us a cycle. At any step during the execution of this closest-pair heuristic, we will have a set of single vertices and vertex-disjoint chains available to merge. In pseudocode:
ClosestPair(P)
Let n be the number of points in set P.
For i = 1 to n − 1 do
d = ∞
For each pair of endpoints (s, t) from distinct vertex chains
if dist(s, t) ≤ d then sm = s, tm = t, and d = dist(s, t)
Connect (sm, tm) by an edge
Connect the two endpoints by an edge
Please note that sm and tm should be sm and tm.
I am not able to follow above logic. Please demonstrate the computation for the simple example which is given in the book: -21, -5, -1, 0, 1, 3, and 11
. Show the computations step by step, so that one can follow above code easily.
Upvotes: 1
Views: 2789
Reputation: 60958
In the following notation, I use parentheses to denote chains. Each vertex starts out as its first chain. The inner loop iterates over all pairs of chain endpoints, i.e. all pairs of nodes which have a parenthesis written immediately next to them, but only those pairs where the two endpoints come from different chains. The results of this inner loop are a pair of endpoints which minimize the distance d
. I'll assume that pairs are sorted s < t
, and furthermore that pairs are traversed in lexicographical order. In that case, the rightmost pair matching that minimal d
will be returned, due to the ≤
in the code.
(-21), (-5), (-1), (0), (1), (3), (11) d = 1, sm = 0, tm = 1
(-21), (-5), (-1), (0 , 1), (3), (11) d = 1, sm = -1, tm = 0
(-21), (-5), (-1 , 0 , 1), (3), (11) d = 2, sm = 1, tm = 3
(-21), (-5), (-1 , 0 , 1 , 3), (11) d = 4, sm = -5, tm = -1
(-21), (-5 , -1 , 0 , 1 , 3), (11) d = 8, sm = 3, tm = 11
(-21), (-5 , -1 , 0 , 1 , 3 , 11) d = 16, sm = -21, tm = -5
(-21 , -5 , -1 , 0 , 1 , 3 , 11) d = 32 to close the loop
So in this example, the code works as intended.
Figure 1.4 will give an example where this code does not work, i.e. will yield a suboptimal result. Labeling the vertices like this
A <--(1+e)--> B <--(1+e)--> C
^ ^ ^
| | |
(1-e) (1-e) (1-e)
| | |
v v v
D <--(1+e)--> E <--(1+e)--> F
In this case you'll get
(A), (B), (C), (D), (E), (F) d = 1-e, sm = C, tm = F
(C, F), (A), (B), (D), (E) d = 1-e, sm = B, tm = E
(C, F), (B, E), (A), (D) d = 1-e, sm = A, tm = D
(C, F), (B, E), (A, D) d = 1+e, sm = E, tm = F
(B, E, F, C), (A, D) d = 1+e, sm = A, tm = B
(D, A, B, E, F, C) d = sqrt((2+2e)^2+(1-e)^2) to close
which is not the optimal solution.
Upvotes: 8