Is it mathematically possible to solve this problem?

Question

x <- abs(rnorm(8))
C <- (x[1]*x[2]*x[3])^(1/3)
y <- log(x/C)

Is it mathematically possible to determine x[1:3] given you only have y? Here, x and y are always vectors of length 8. I should note that x is known for some of my dataset, which could be useful to find a solution for the other portion of the data where x is unknown. All of my code is implemented in R, so R code would be appreciated if this is solvable!

G. Grothendieck · Accepted Answer

Defining f as

f <- function(x) {
  C <- (x[1]*x[2]*x[3])^(1/3)
  log(x/C)
}

we first note that if k is any scalar constant then f(x) and f(k*x) give the same result so if we have y = f(x) we can't tell whether y came from x or from k*x. That is, y could have come from any scalar multiple of x; therefore, we cannot recover x from y.

Linear formulation

Although we cannot recover x we can determine x up to a scalar multiple. Define the matrix A:

ones <- rep(1, 8)
a <- c(1, 1, 1, 0, 0, 0, 0, 0)
A <- diag(8) - outer(ones, a) / 3

in which case f(x) equals:

A %*% log(x)

Inverting formula

From this formula, given y and solving for x, the value of x would equal

exp(solve(A) %*% y)  ##   would equal x if A were invertible

if A were invertible but unfortunately it is not. For example, rowSums(A) equals zero which shows that the columns of A are linearly dependent which implies non-invertibility.

all.equal(rowSums(A), rep(0, 8))
## [1] TRUE

Rank and nullspace

Note that A is a projection matrix. This follows from the fact that it is idempotent, i.e. A %*% A equals A.

all.equal(A %*% A, A)
## [1] TRUE

It also follows from the fact that its eigenvalues are all 0 and 1:

zapsmall(eigen(A)$values)
## [1] 1 1 1 1 1 1 1 0

From the eigenvalues we see that A has rank 7 (the number of nonzero eigenvalues) and the dimension of the nullspace is 1 (the number of zero eigenvalues).

Another way to see this is that knowing that A is a projection matrix its rank equals its trace, which is 7, so its nullspace must have dimension 8-7=1.

sum(diag(A))  # rank of A
## [1] 7

Taking scalar multiples spans a one dimensional space so from the fact that the nullspace has dimension 1 it must be the entirely of the values that map into the same y.

Key formula

Now replacing solve in ## above with the generalized inverse, ginv, we have this key formula for our approximation to x given that y = f(x) for some x:

library(MASS)
exp(ginv(A) %*% y)   # approximation to x accurate up to scalar multiple

or equivalently if y = f(x)

exp(y - mean(y))

While these do not give x they do determine x up to a scalar multiple. That is if x' is the value produced by the above expressions then x equals k * x' for some scalar constant k.

For example, using x and y from the question:

exp(ginv(A) %*% y)
##           [,1]
## [1,] 1.2321318
## [2,] 0.5060149
## [3,] 3.4266146
## [4,] 0.1550034
## [5,] 0.2842220
## [6,] 3.7703442
## [7,] 1.0132635
## [8,] 2.7810703

exp(y - mean(y))  # same
## [1] 1.2321318 0.5060149 3.4266146 0.1550034 0.2842220 3.7703442 1.0132635
## [8] 2.7810703

exp(y - mean(y))/x
## [1] 2.198368 2.198368 2.198368 2.198368 2.198368 2.198368 2.198368 2.198368

Note

Note that y - mean(y) can be written as

B <- diag(8) - outer(ones, ones) / 8
B %*% y

and if y = f(x) then y must be in the range of A so we can verify that:

all.equal(ginv(A) %*% A, B %*% A)
## [1] TRUE

It is not true that the matrix ginv(A) equals B. It is only true that they act the same on the range of A which is all that we need.