Algorithm for finding a linear dependence with strictly positive coefficients

This is surely well known, being a particular linear programming problem. What I want is a specific, easy-to-implement, efficient algorithm adapted to this particular case, for relatively small sizes (say, about ten vectors of dimension less than twenty).

I have vectors v(1), ..., v(m) of the same dimension. I want an algorithm that produces strictly positive numbers c(1), ..., c(m) such that c(1)v(1) + ... + c(m)v(m) is the zero vector, or tells for sure that no such numbers exist.

What I found (in some clever code by a colleague) amounts to an approximate algorithm like this:

start with, say, c(1) = ... = c(m) = 1/m;

at each stage, given the current approximation v = c(1)v(1) + ... + c(m)v(m), look for a j such that v - v(j) is longer than v(j).

If no such j exists then output "no solution" (or c(1), ..., c(m) if v is zero).

If such a j exists, change v to the new approximation (1 - c)v + cv(j) for some small positive c.

This changes c(j) to (1 - c)c(j) + c and each other c(i) to (1 - c)c(i), so the new coefficients remain positive and strictly less than 1 (in fact they sum to 1 all the time, i.e., we remain in the convex hull of the v(i)).

Moreover the new v will have strictly smaller length, so eventually the algorithm will either discover that there is no solution or will produce arbitrarily small v.
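For concreteness, here is a minimal Python sketch of this heuristic as I understand it (the function name, step size, tolerance and iteration cap are mine, not from the colleague's code):

import numpy as np

def approximate_positive_dependence(vectors, step=0.01, tol=1e-9, max_iter=100000):
    V = np.array(vectors, dtype=float)   # rows are v(1), ..., v(m)
    m = len(V)
    c = np.full(m, 1.0 / m)              # start with c(1) = ... = c(m) = 1/m
    for _ in range(max_iter):
        v = c @ V                        # current approximation
        if np.linalg.norm(v) < tol:
            return c                     # all coefficients positive and summing to 1
        # look for j with |v - v(j)| > |v(j)|
        js = [j for j in range(m) if np.linalg.norm(v - V[j]) > np.linalg.norm(V[j])]
        if not js:
            return None                  # heuristic answer: "no solution"
        j = js[0]
        c = (1 - step) * c               # each c(i) becomes (1 - step) c(i) ...
        c[j] += step                     # ... and c(j) gains step, so all stay positive
    return None                          # gave up after max_iter stages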

Clearly this is incomplete and not satisfactory from several points of view. Can one do better?

Update

There are by now two useful answers; however, one final step is missing. They both boil down to the following (unless I am missing some essential point).

Take a basis of the nullspace of v(1), ..., v(m). One obtains a collection of not necessarily strictly positive solutions c = (c(1), ..., c(m)), c' = (c'(1), ..., c'(m)), c'' = (c''(1), ..., c''(m)), ... such that every solution is a linear combination of them (in a unique way). So we are reduced to the question of whether this new collection of m-dimensional vectors admits a linear combination with strictly positive entries.

Example: take four 2d vectors (2,1), (3,-1), (-1,2), (-3,-3). Their nullspace has a basis consisting of two solutions c = (12,-3,0,5), c' = (-1,1,1,0). Neither of these is strictly positive, but their combination c + 4c' = (8,1,4,5) is. So the latter is the desired solution. But in general it might not be so easy to find out whether a strictly positive solution exists and, if yes, how to find it.
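Just to double-check the example numerically (using numpy; this only verifies the arithmetic above):

import numpy as np

V = np.array([[2, 1], [3, -1], [-1, 2], [-3, -3]])   # the four 2d vectors as rows
c  = np.array([12, -3, 0, 5])
cp = np.array([-1, 1, 1, 0])

print(c @ V)              # [0 0] -- c is a (not strictly positive) solution
print(cp @ V)             # [0 0] -- so is c'
print(c + 4 * cp)         # [8 1 4 5] -- strictly positive
print((c + 4 * cp) @ V)   # [0 0]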

As suggested in the answer by btilly, one might use Fourier-Motzkin elimination for that, but again, I would be grateful for more details about it.
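Since the problem is a linear program anyway, one way to settle the feasibility question (this is not what either answer proposes; scipy and the function name are my own choices here) is to ask an LP solver for a solution of A c = 0 with every c(i) >= 1, which exists iff a strictly positive solution exists, by rescaling:

import numpy as np
from scipy.optimize import linprog

def strictly_positive_dependence(vectors):
    # Columns of A are v(1), ..., v(m); a strictly positive solution of A c = 0
    # exists iff one exists with every c(i) >= 1 (just rescale), which is an
    # LP feasibility question.
    A = np.array(vectors, dtype=float).T
    m = A.shape[1]
    res = linprog(c=np.zeros(m),                      # no objective: feasibility only
                  A_eq=A, b_eq=np.zeros(A.shape[0]),
                  bounds=[(1, None)] * m)
    return res.x if res.success else None

print(strictly_positive_dependence([(2, 1), (3, -1), (-1, 2), (-3, -3)]))
# prints a strictly positive coefficient vector for the example above (or None if infeasible)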

Upvotes: 2

Views: 518

Answers (2)

Tomer Geva

Reputation: 1834

Denoting by A the matrix whose ith column is v(i) and by x the vector whose ith entry is c(i), your problem can be described as Ax = b where b = 0 is the zero vector. The problem Ax = b with b not equal to zero is called the least squares problem (or inhomogeneous least squares) and has a closed-form solution in the sense of Minimal Mean Square Error (MMSE). In your case, however, b = 0, so we are in the homogeneous least squares problem. In linear algebra this can be viewed as an eigenvalue problem, whose solution is an eigenvector x of the matrix A^TA whose eigenvalue is equal to 0. If no such eigenvalue exists, the MMSE solution will be the eigenvector x whose matching eigenvalue is the smallest (closest to 0). A nice discussion on this topic is given here.

The solution, as stated above, will be the eigenvector of A^TA with the lowest matching eigenvalue. It can be found using the Singular Value Decomposition (SVD), which decomposes the matrix A into A = U Sigma V^T. The column of V matching the smallest singular value in the diagonal matrix Sigma will be your solution.

Explanation

When we want to minimize ||Ax||^2 in the MSE sense (subject to ||x|| = 1, to exclude the trivial solution), we can compute the vector derivative w.r.t. x as follows:

d/dx [ x^T A^T A x - lambda (x^T x - 1) ] = 2 A^T A x - 2 lambda x = 0   =>   A^T A x = lambda x

Therefore, the eigenvector of A^TA matching the smallest eigenvalue will solve your problem.
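As a small numerical illustration of this claim (the random test matrix and variable names are mine), the eigenvector of A^TA for the smallest eigenvalue agrees, up to sign, with the last right singular vector of A when the smallest singular value is simple:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))              # generic tall matrix, distinct singular values
eigvals, eigvecs = np.linalg.eigh(A.T @ A)   # eigh returns eigenvalues in ascending order
x_eig = eigvecs[:, 0]                        # eigenvector for the smallest eigenvalue
_, _, Vt = np.linalg.svd(A)                  # singular values in descending order
x_svd = Vt[-1]                               # last row of V^T = last column of V
print(np.allclose(np.abs(x_eig), np.abs(x_svd)))   # True: equal up to sign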

Practical solution example

In Python, you can use numpy.linalg.svd to perform the SVD. numpy orders the singular values in decreasing order, so the columns of U and V (equivalently, the rows of V^T) run from the largest singular value to the smallest. Thus, you need to compute the SVD and take the rightmost column of the resulting V, i.e. the last row of V^T:

from numpy.linalg import svd

# A is the matrix whose columns are the vectors v(i)
[_, _, vt] = svd(A)
x = vt[-1]  # the last row of V^T is the last column of V (smallest singular value)

One zero eigenvalue

In this case there is only one nontrivial direction that solves the problem, and the only way to satisfy the strictly positive condition is for the entries of that vector to be all positive or all negative (multiplying the vector by -1 does not change the result).
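Written out as a small sketch (reusing vt from the snippet above; the variable names are mine):

import numpy as np   # assuming A and vt are as in the snippet above

x = vt[-1]
if np.all(x > 0):
    c = x            # already strictly positive
elif np.all(x < 0):
    c = -x           # flipping the sign keeps A c = 0
else:
    c = None         # mixed signs: no strictly positive dependence in this case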

Multiple zero eigenvalues

In the case where we have multiple zero eigenvalues, any of the matching eigenvectors, as well as any linear combination of them, is a possible solution. One then has to check whether there is a linear combination of these eigenvectors all of whose entries are strictly positive, in order to satisfy the strictly positive condition.

How do we find the solution if one exists? Once we are left with the basis of eigenvectors matching the zero eigenvalue (also known as the null-space), what we need to do is solve a system of linear inequalities. I'll explain by example, since it will be clearer this way. Suppose we have the following matrix:

import numpy as np
A = np.array([[ 2,  3, -1, -3],
              [ 1, -1,  2, -3]])
[_, Sigma, Vt] = np.linalg.svd(A)  # Sigma has only 2 non-zero values, meaning that the null-space has dimension 2

We can extract the eigenvectors as explained above:

C = Vt[len(Sigma):]
# array([[-0.10292809,  0.59058542,  0.75313786,  0.27092073],
#        [ 0.89356997, -0.15289589,  0.09399548,  0.4114856 ]])

What we want to find are two real coefficients, denoted x and y, such that:

-0.10292809*x + 0.89356997*y > 0
 0.59058542*x - 0.15289589*y > 0
 0.75313786*x + 0.09399548*y > 0
 0.27092073*x + 0.4114856*y  > 0

We have a system of 4 inequalities in 2 variables, so in this case a solution is not guaranteed to exist. A solution can be found in many ways, but I will propose the following. We start with an initial guess and go over each hyperplane, checking whether the current guess satisfies the inequality; if not, we reflect the guess to the other side of the hyperplane. After passing through all the hyperplanes we check whether we ended up with a solution. (An explanation of how to reflect a point w.r.t. a line can be found here.) An example Python implementation:

import numpy as np

def get_strictly_positive(A):
    [_, Sigma, Vt] = np.linalg.svd(A)
    if len(Sigma[np.abs(Sigma) > 1e-5]) == Vt.shape[0]:  # no zero singular values: take the MMSE solution, if it works
        c = Vt[-1]
        if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
            return c if np.sum(c) == np.sum(abs(c)) else -1 * c
        else:
            return -1
    # Otherwise some singular values are zero, i.e. the null-space is nontrivial
    # Build the matrix C whose rows are the null-space basis vectors
    C = Vt[len(Sigma[np.abs(Sigma) > 1e-5]):]
    # 1. What we have here is a system of linear inequalities. Each inequality defines a hyperplane, and for
    # each inequality there is a valid half-space. We want to find the intersection of all the half-spaces, if it exists.
    # 2. A very important observation is that the basis of the null-space that we found using SVD is ORTHOGONAL!
    coeffs = np.ones(C.shape[0])  # initial guess
    for hyperplane in C.T:
        if coeffs.dot(hyperplane) <= 0:  # the guess is on the wrong side of the hyperplane
            orthogonal_part = coeffs - (coeffs.dot(hyperplane) / hyperplane.dot(hyperplane)) * hyperplane
            # reflecting the coefficients to the other side of the hyperplane
            coeffs = 2 * orthogonal_part - coeffs
    # If this yielded a solution, we return it
    c = C.T.dot(coeffs)
    if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
        return c if np.sum(c) == np.sum(abs(c)) else -1 * c
    else:
        return -1
  • The equations are taken from one of my summaries and therefore I do not have a link to the source
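For example, calling the function on the matrix A used above should return a strictly positive vector in its null-space (the exact numbers depend on the signs numpy's SVD picks for the basis vectors, and the single-pass reflection heuristic can in general also return -1):

A = np.array([[ 2,  3, -1, -3],
              [ 1, -1,  2, -3]])
c = get_strictly_positive(A)
print(c)                                          # e.g. roughly [0.79, 0.44, 0.85, 0.68]
print(np.all(c > 0), np.allclose(A.dot(c), 0))    # True True if a solution was found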

Upvotes: 1

btilly

Reputation: 46408

This is doable as follows.

First write your vectors as columns and put them into a matrix. Now create a single column vector with entries c(1), c(2), ..., c(m). If you multiply the matrix by that column, you get your linear combination.

Now consider the elementary row operations: multiply a row by a nonzero constant, swap two rows, add a multiple of one row to another. If you apply an elementary row operation to the matrix, your linear combination after the row operation will be 0 if and only if it was 0 before the row operation. Therefore doing elementary row operations DOESN'T CHANGE the coefficients that you're looking for.

Therefore you may simplify life by doing elementary row operations to put the matrix into reduced row echelon form. Once it is in reduced row echelon form, life gets easier. Columns which do not contain a pivot correspond to free coefficients. Columns which do contain a pivot correspond to coefficients that must be a specific linear combination of the free coefficients. This reduces your problem to finding positive values for the free coefficients that make the other coefficients positive as well. So you're now just solving a system of inequalities (generally in far fewer variables).
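To illustrate on the example from the question's update (using sympy for exact row reduction; sympy is my choice here, not part of the answer):

from sympy import Matrix

A = Matrix([[2, 3, -1, -3],
            [1, -1, 2, -3]])      # columns are the vectors (2,1), (3,-1), (-1,2), (-3,-3)
R, pivots = A.rref()
# R = [[1, 0, 1, -12/5],
#      [0, 1, -1, 3/5]],  pivots = (0, 1)
# So c(1) = -c(3) + (12/5) c(4) and c(2) = c(3) - (3/5) c(4), with c(3), c(4) free.
# We need c(3) > 0, c(4) > 0 that also make c(1), c(2) > 0;
# e.g. c(3) = 4, c(4) = 5 gives (c(1), c(2), c(3), c(4)) = (8, 1, 4, 5), as in the question.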

Whether a system of linear inequalities has a solution can be answered with Fourier-Motzkin elimination (FME).

Upvotes: 1
