Devs love ZenUML
Devs love ZenUML

Reputation: 11872

Get the position of points on a line segment with constraints

I am designing a layout engine for ZenUML. One requirement (after simplification) is this:

  1. There a line segment;
  2. There are n (n < 100, it can be reduced to n < 30 if that makes difference in performance) points on this line segment, the order is fixed; (e.g. P1 ~ Pn)
  3. There are known minimum distances between some points; (e.g. m_d(P2,P4)=500)
  4. The length of the line segment should be as small as possible;
  5. (nice to have) The gaps between neighbour points should be as even as possible (measured by standard deviation^, and must not violate 1~4).
  6. (newly added) it must give result in no longer than 1s for 30 points in worst case (more or less constraints).

^ standard deviation is used as a best alternative that I can precisely describe with my limited math knowledge. However as Roman and David pointed out, there is more appealing result in some cases that does not satisfy lowest standard deviation. See this comment

A sister questions is asked at https://math.stackexchange.com/q/4377433. It is the same problem described with math language (matrix).

For example, for a line segment that has 4 points on it (P1 ~ P4).

m_d(P1, P3) = 200, m_d(P1, P4) = 900.

Some background of this problem. Have a look at the following diagram.

I have implemented the following which satisfies constraints #1~#4. Can some please help with constraint #5? It can be partially met (like b in the example) or fully met (like c in the example). We can use standard deviation to measure the quality of the solution. Updated: thanks to Roman Svistunov, the base implementation to get non-optimal position is dramatically simplified.

// Known minimum distance is stored in a matrix
// e.g.
//    m = [
//      [0, 100, 0, 900],
//      [0, 0,   0, 0],
//      [0, 0,   0, 0],
//      [0, 0,   0, 0],
//    ]

let f = function (n: number, m: Array<Array<number>>): number {
  let array = Array(n).fill(0);
  for (let i = 0; i < n; i++) {
    array[i] = f(i, m) + m[i][n];
  }
  return Math.max(0, ...array);
}

Upvotes: 14

Views: 992

Answers (3)

David Eisenstat
David Eisenstat

Reputation: 65458

Since you indicated that an approximation would be OK, here’s a simple code that computes approximately the same result as my other answer. The adjacency list version is longer but likely significantly faster on sparse constraint matrices.

// Increase for faster but less accurate results.
const PIXEL = 1;

// Returns a layout where each gap is within PIXEL of optimal.
function layOut(n, m) {
  let positions = Array(n).fill(0);
  let gaps = Array(n).fill(0);
  while (true) {
    // Compute the leftmost layout.
    for (let j = 1; j < n; j++) {
      positions[j] = positions[j - 1] + gaps[j - 1];
      for (let i = 0; i < j; i++) {
        positions[j] = Math.max(positions[j], positions[i] + m[i][j]);
      }
    }
    // Compute the rightmost layout. As we do so, each gap that can grow, grows.
    let stillGrowing = false;
    for (let i = n - 2; i > -1; i--) {
      let proposedGap = gaps[i] + PIXEL;
      // i + 1 is rightmost; i is leftmost.
      let maxGap = positions[i + 1] - positions[i];
      if (proposedGap < maxGap) {
        gaps[i] = proposedGap;
        stillGrowing = true;
      } else {
        gaps[i] = maxGap;
      }
      positions[i] = positions[i + 1] - gaps[i];
      for (let j = n - 1; j > i; j--) {
        positions[i] = Math.min(positions[i], positions[j] - m[i][j]);
      }
    }
    if (!stillGrowing) {
      return positions;
    }
  }
}

// Adjacency list version.
function layOut(n, m) {
  let leftArcs = [];
  let rightArcs = [];
  for (let i = 0; i < n; i++) leftArcs.push([]);
  for (let j = 0; j < n; j++) rightArcs.push([]);
  for (let j = 1; j < n; j++) {
    for (let i = 0; i < j; i++) {
      let x = m[i][j];
      if (x > 0) {
        leftArcs[j].push([i, x]);
        rightArcs[i].push([j, x]);
      }
    }
  }
  let positions = Array(n).fill(0);
  let gaps = Array(n).fill(0);
  while (true) {
    // Compute the leftmost layout.
    for (let j = 1; j < n; j++) {
      positions[j] = positions[j - 1] + gaps[j - 1];
      for (let [i, x] of leftArcs[j]) {
        positions[j] = Math.max(positions[j], positions[i] + x);
      }
    }
    // Compute the rightmost layout. As we do so, each gap that can grow, grows.
    let stillGrowing = false;
    for (let i = n - 2; i > -1; i--) {
      let proposedGap = gaps[i] + PIXEL;
      // i + 1 is rightmost; i is leftmost.
      let maxGap = positions[i + 1] - positions[i];
      if (proposedGap < maxGap) {
        gaps[i] = proposedGap;
        stillGrowing = true;
      } else {
        gaps[i] = maxGap;
      }
      positions[i] = positions[i + 1] - gaps[i];
      for (let [j, x] of rightArcs[i]) {
        positions[i] = Math.min(positions[i], positions[j] - x);
      }
    }
    if (!stillGrowing) {
      return positions;
    }
  }
}

// Various test cases.
console.log(
  layOut(4, [
    [0, 200, 0, 900],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(5, [
    [0, 0, 200, 0, 0],
    [0, 0, 0, 550, 0],
    [0, 0, 0, 0, 800],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 150],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 200],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 300],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

Upvotes: 2

David Eisenstat
David Eisenstat

Reputation: 65458

I’ll work through some theory and then give some JavaScript code that meets 1–4 and 6, a modified version of 5, and additionally

  1. is robust to numerical issues
  2. has no external dependencies.

We can express 1–4 as a linear program. Your sample instance, for example, would translate as

minimize x4 - x1
subject to

(known minimum distances)
a: x3 - x1 >= 200
b: x4 - x1 >= 900

(fixed order)
c: x2 - x1 >= 0
d: x3 - x2 >= 0
e: x4 - x3 >= 0

x1, x2, x3, x4 unbounded

Every linear program has a dual program with the same optimum objective value (under conditions that are met here). The dual of the program above is below:

maximize 200 a + 900 b
subject to

x1: -a - b - c = -1
x2: c - d = 0
x3: a + d - e = 0
x4: b + e = 1

a, b, c, d, e >= 0

This is a max-cost flow problem with no capacity constraints. In consequence, some path is an optimal flow, so we can understand criteria 1–4 as a longest path problem.

Once we have the optimal objective, 900 here, we can shift criterion 4 to a constraint

x4 - x1 <= 900

and think about how to make the most pleasing layout. As written, we can achieve criterion 5 by solving a quadratic program:

minimize (x2 - x1)**2 + (x3 - x2)**2 + (x4 - x3)**2
subject to

(most efficient layout)
x4 - x1 <= 900

(known minimum distances)
x3 - x1 >= 200
x4 - x1 >= 900

(fixed order)
x2 - x1 >= 0
x3 - x2 >= 0
x4 - x3 >= 0

x1, x2, x3, x4 unbounded

The objective that I’ve written is not the standard deviation, but it doesn’t matter. Minimizing the standard deviation is equivalent to minimizing the variance. In turn, the variance of a random variable X is E[X**2] - E[X]**2, and E[X]**2 is a constant here (E[X] = 900/4 = 150 in the example). Finally, maximizing E[X**2] is equivalent to maximizing the sum of squares of the equally likely outcomes.

I would propose a different criterion however.

5′. Maximize (in lexicographic order) the list of gap lengths sorted small to large.

What this means is that we try as hard as possible to avoid small gaps. In an example from Roman Svistunov, suppose that we have constraints

x3 - x1 >= 2
x4 - x2 >= 5.5
x5 - x3 >= 8

The solution with minimum standard deviation sets

x1 = 0
x2 = 0.75
x3 = 2
x4 = 6.25
x5 = 10

whereas the solution that optimizes the replacement objective sets

x1 = 0
x2 = 1
x3 = 2
x4 = 6.5
x5 = 10

We’ve made x2 more centered at the cost of making x4 less centered. Since x2 is closer to its neighbors than x4, this seems like a reasonable trade to me, but you obviously have the final say.

To optimize 5′, we proceed as follows. Start by formulating a linear program like

maximize z
subject to

(most efficient layout)
x5 - x1 <= 10

(known minimum distances)
x3 - x1 >= 2
x4 - x2 >= 5.5
x5 - x3 >= 8

(fixed order)
x2 - x1 >= z
x3 - x2 >= z
x4 - x3 >= z
x5 - x4 >= z

x1, x2, x3, x4, x5 unbounded

We solve for the optimum, z = 1 here, and also record which gaps prevented it from being larger. We replace those z variables and solve again, repeating until there is no occurrence of z.

maximize z
subject to

(most efficient layout)
x5 - x1 <= 10

(known minimum distances)
x3 - x1 >= 2
x4 - x2 >= 5.5
x5 - x3 >= 8

(fixed order)
x2 - x1 = 1
x3 - x2 = 1
x4 - x3 >= z
x5 - x4 >= z

x1, x2, x3, x4, x5 unbounded

The optimum is z = 3.5.

maximize z
subject to

(most efficient layout)
x5 - x1 <= 10

(known minimum distances)
x3 - x1 >= 2
x4 - x2 >= 5.5
x5 - x3 >= 8

(fixed order)
x2 - x1 = 1
x3 - x2 = 1
x4 - x3 >= z
x5 - x4 = 3.5

x1, x2, x3, x4, x5 unbounded

The optimum is z = 4.5, and that’s the last iteration.

As observed earlier, we can solve these linear programs using an extended longest path algorithm. In JavaScript:

// Dual numbers represent linear functions of time.
function Dual(p, v) {
  return { position: p, velocity: v };
}

// Adds dual numbers x and y.
function dualPlus(x, y) {
  return Dual(x.position + y.position, x.velocity + y.velocity);
}

const epsilon = Math.sqrt(Number.EPSILON);

// Compares dual numbers x and y by their position at a time infinitesimally
// after now.
function dualLessThan(x, y) {
  let d = x.position - y.position;
  return d < -epsilon || (Math.abs(d) <= epsilon && x.velocity < y.velocity);
}

// Tracks delta, the length of time for which every pair of arguments to
// .dualLessThan() maintains the present relation.
function DeltaTracker() {
  return {
    delta: Infinity,
    dualLessThan: function (x, y) {
      let lessThan = dualLessThan(x, y);
      if (lessThan) {
        [x, y] = [y, x];
      }
      if (x.velocity < y.velocity) {
        this.delta = Math.min(
          this.delta,
          (x.position - y.position) / (y.velocity - x.velocity)
        );
      }
      return lessThan;
    },
  };
}

// Converts the adjacency matrix to an adjacency list representation.
function graphFromMatrix(n, matrix) {
  let graph = [];
  for (let j = 0; j < n; j++) {
    graph.push([]);
    for (let i = 0; i < j; i++) {
      if (matrix[i][j] > 0) {
        graph[j].push({ i: i, length: Dual(matrix[i][j], 0) });
      }
    }
  }
  return graph;
}

// Essentially the usual algorithm for longest path, but tracks delta.
function longestPathTable(graph, gaps) {
  let tracker = DeltaTracker();
  let maximum = Dual(0, 0);
  let table = [];
  for (let j = 0; j < graph.length; j++) {
    let argument = null;
    if (j > 0) {
      maximum = dualPlus(maximum, gaps[j - 1]);
    }
    for (let edge of graph[j]) {
      let x = dualPlus(table[edge.i].maximum, edge.length);
      if (tracker.dualLessThan(maximum, x)) {
        argument = edge.i;
        maximum = x;
      }
    }
    table.push({ argument: argument, maximum: maximum });
  }
  return [tracker.delta, table];
}

// Essentially the usual algorithm for decoding the longest path from the
// dynamic program table.
function freezeCriticalGaps(table, graph, gaps) {
  let j = table.length - 1;
  while (j > 0) {
    let critical = true;
    let argument = table[j].argument;
    if (argument !== null) {
      j = argument;
    } else {
      j--;
      gaps[j].velocity = 0;
    }
  }
}

// Changes the time from now to now + delta.
function advanceGaps(gaps, delta) {
  for (let i = 0; i < gaps.length; i++) {
    gaps[i].position += gaps[i].velocity * delta;
  }
}

// Extracts the final result from the dynamic program table.
function positionsFromTable(table) {
  let positions = [];
  for (let entry of table) {
    positions.push(entry.maximum.position);
  }
  return positions;
}

// Entry point for the layout algorithm.
function layOut(n, matrix) {
  let graph = graphFromMatrix(n, matrix);
  let gaps = [];
  for (let j = 1; j < n; j++) {
    gaps.push(Dual(0, 1));
  }
  while (true) {
    let [delta, table] = longestPathTable(graph, gaps);
    if (delta == Infinity) {
      return positionsFromTable(table);
    }
    if (table[n - 1].maximum.velocity > 0) {
      freezeCriticalGaps(table, graph, gaps);
    } else {
      advanceGaps(gaps, delta);
    }
  }
}

// Various test cases.
console.log(
  layOut(4, [
    [0, 200, 0, 900],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(5, [
    [0, 0, 2, 0, 0],
    [0, 0, 0, 5.5, 0],
    [0, 0, 0, 0, 8],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 150],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 200],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 300],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

Upvotes: 8

Roman Svistunov
Roman Svistunov

Reputation: 1079

First, we should find the smallest possible length of the segment. This can be done with the greedy algorithm: place the first point at 0 (without loss of generality) and add the following points one by one with a coordinate as small as possible without violating distances to all previous points. We would fix the last point at its position and would not move it any further. I would suppose that correctness is obvious.

Now the tricky part with optimizing the standard deviation of neighboring distances. We would introduce n-1 variables - the distances, I would name them Di. Trying to minimize their standard deviation is the same as minimizing variance.

And because E(D) is constant as we fixed the first and last points, minimizing standard deviation is the same as minimizing the sum of differences.

Constraints on distances are all linear as whenever you have m_d(Pi,Pj) you would add constraint (sum Dk for k=i to j) >= m_d(Pi,Pj). To guarantee that sum of distances is what we need it to be we also would add contracts sum Dk is both less or equal and greater or equal to the minimum distance calculated in step one. Therefore this problem is a quadratic optimization problem, that can be solved with libraries.

Note that the coordinates for the points then would be easily restored from distances as prefix sums.

Even though for 30 points probably any algorithm of quadratic programming would be fast enough if you want to invest time into further optimization if you intend to increase the number of points, as this problem can actually be solved exactly in polynomial time.

The first and most important optimization. Because the target function is the sum of squares, the matrix of target quadratic form is positive definite (as it is the identity matrix), and therefore there exists a polynomial algorithm, as have been shown in this paper.

The last optimization I would like to mention might not be relevant, but it is very good on most of the possible m_d matrices. To reduce the quadratic programming problem size, we can find optimal positions for some points before quadratic programming.

First, we would make reductions in matrix to ensure that for any i<j<k it is true that m_d(Pi,Pj)+m_d(Pj,Pk)>=m_d(Pi,Pk) with updating the right side. Note that with this computation m_d(P1,Pn) is also the minimum possible length of the segment. After this, we can say that for any i, if m_d(P1,Pi)+m_d(Pi,Pn)=m_d(P1,Pn) is true there is only one possible position for the point Pi, and, this allows us to remove one of the variables as now distance from P(i-1) to Pi and from Pi to P(i+1) can be replaced with one variable - distance from P(i-1) to P(i+1): the first distance is just Pi position - P(i-1) position with can be derived by induction, and the P(i+1) position is P(i-1) position + this new distance.

Edit 2

There are two versions of this pseudocode, both assume you use some kind of quadratic programming library, first one is the intuitive one (and used in a few libraries), the second one is for libraries that use matrices as their input (but less intuitive):

function findOptimal(Matrix md, int n) {
  int minLength = findMinLength(md, n);
  Constraints constraints; //from library
  for (int i = 0; i < n; ++i) {
    for (int j = 0; j < m; ++j) {
      if (j < i)
        //Here sum of variables is not their current sum
        //but some object for library
        constraints.AddConstranit(sum of variables i to j >= md[j][i])
      else
        constraints.AddContraint(sum of variables j to i >= md[j][i])
    }
  }
  constraints.AddConstraint(sum of variables 0 to n-1 <= minLength)
  variables = QuadraticOptimizationSolve(sum of squares of variables, contracts)
  positions = Array<number>(n);
  positions[0] = 0;
  for (int i = 0; i < n; ++i) {
    positions[i + 1] = positions[i] + variables[i];
  }
  return positions;
}

As matrices are more common as inputs in libraries, here is the pseudocode to work with them:

function findOptimal(Matrix md, int n) {
  int minLength = findMinLength(md, n);
  Matrix optimization = Matrix(n - 1, n - 1);
  for (int i = 0; i < n - 1; ++i) {
    for (int j = 0; j < n - 1; ++j) {
      if (i == j)
        optimization[i][j] = 1
      else
        optimization[i][j] = 0
    }
  }
  Matrix constraints = Matrix(n * (n - 1) / 2 + 1, n)
  for (int i = 1; i < n; ++i) {
    for (int j = 0; j < i; ++j) {
      constraintRow = i * (i - 1) / 2 + j;
      //constriant is sum of distances from j to i is 
      //at least md[j][i]
      for (int k = j; k < i; ++k) {
        //these are coefficients in linear combination
        //negative because we need >= while optimizers usually use <= contraints
        if (k >= k and k < i)
          constraints[constraintRow][k] = -1
        else
          constraints[constraintRow][k] = 0
      }
      constraints[constraintsRow][n - 1] = -md[i][j];
    }
  }
  variables = QuadraticOptimizationSolve(optimization, constraints);
  positions = Array<number>(n);
  positions[0] = 0;
  for (int i = 0; i < n; ++i) {
    positions[i + 1] = positions[i] + variables[i];
  }
  return positions;
}
Also this solution assumes in your library the constants vector in constraints is the last column, but sometimes it might be separate vector

Edit 3

This is the javascript solution for your problem. Easily handles 30 points in a second (actually on my setup it can process up to 100 points in a second). It uses a different implementation of findMinLength function than yours (yours does not save the result of recursive calls and is exponential because of that). Also, I was unable to find a good library for quadratic programming, so I ended up using this solution (was edited in edit 4). This is for node js, but combining all the files (and removing requires and exports, for vsmall you need to define vsmall as epsilon in their file), and this library seems to work. It appears that the previous library was just a browser adaptation of this one, and this one had an issue that was fixed years after the adaptation was made, but it is quite easy to do it manually (as it has no dependencies).

let findMinLength = function (n, m) {
  let answer = [];
  answer[0] = 0;
  for (let i = 1; i < n; i++) {
    answer[i] = 0;
    for (let j = 0; j < i; j++) {
      answer[i] = Math.max(answer[i], answer[j] + m[i][j], answer[j] + m[j][i]);
    }
  }
  return answer[n - 1];
}
let findOptimal = function(n, m) {
  let minLength = findMinLength(n, m);
  let dmat = [];
  let dvec = [];
  for (let i = 0; i < n - 1; i++) {
    dmat[i + 1] = []
    dvec[i + 1] = 0
    for (let j = 0; j < n - 1; j++) {
      if (i == j)
        dmat[i + 1][j + 1] = 1;
      else
        dmat[i + 1][j + 1] = 0;
    }
  }
  let amat = [];
  for (let i = 0; i < n - 1; i++) {
    amat[i + 1] = [];
    amat[i + 1][1] = -1;
  }
  let bvec = [];
  bvec[1] = -1;
  for (let i = 1; i < n; i++) {
    for (let j = 0; j < i; j++) {
      let constraintRow = i * (i - 1) / 2 + j;
      for (let k = 0; k < n - 1; k++) {
        amat[k + 1][constraintRow + 2] = (k >= j && k < i) ? 1 : 0;
      }
      bvec[constraintRow + 2]=Math.max(m[i][j], m[j][i]) / minLength;
    }
  }
  let variables = solveQP(dmat,dvec,amat,bvec).solution;
  let positions = [];
  positions[0] = 0;
  for (let i = 0; i < n - 1; i++) {
    positions[i + 1] = positions[i] + variables[i + 1] * minLength;
  }
  return positions;
}

For example, calling findOptimal(4, m) where m is as in your example (with 4 points), the function returns [0,300,600,900]

Edit

As noted in @david-eisenstat there exists an interpretation of the 5th point as maximizing minimum gap, but there is a simpler and asymptotically faster solution, namely, answer binary search.

We know that the minimum gap is at least 0 and at most the segment length / n. We can maximize it with binary search, with a checker ensuring that every gap is at least our value. If findMinLength is a function that solves 1-4, the following is the solution that optimizes this metric (pseudocode):

function findOptimal(Matrix md, int n) {
  minLength = findMinLength(md, n) //The answer to minimize segment length
  minGapL = 0 //Lower bound in maximized minimum gap length
  minGapR = ceiling(minLength / n) //Upper bound in maximized minimum gap
  while minGapL < minGapR {
    minGapTest = (minGapL + minGapR) / 2
    mdCopy = md.copy()
    //This loop ensures that every gap is at least minimum gap tested
    for (int i = 1; i < n; ++i) {
      mdCopy[i - 1][i] = max(mdCopy[i - 1][i], miGapTest)
    }
    int testLength = findMinLength(mdCopy, n)
    if (testLength == minLength) {
      //if it is possible to get tested minimum gap update lower bound
      minGapL = minGapTest
    } else {
      //otherwise update lower bound
      minGapR = minGapTest - 1;
    }
  }
  return (minLength, minGapL);
}

With this function returning both length and maximized minGap one can reconstruct the answer with the matrix modification as in the loop and then solve just problem for 1-4. This solution is polynomial (the 1-4 problem can be solved in linear time as shown above, and binary search is logarithmic of the answer, and with the answer being exponential of the input size it is also linear of the input size).

Upvotes: 5

Related Questions