Reputation: 14485
It seems that there are many useful applications for matrix math where not all entries in a given matrix share the same units. I want to look into type systems that could track these units and ensure we don't make mistakes (similar to a number of libraries and languages that already do dimension checking for scalar arithmetic). I'll give an example of what I am talking about, and then I have a few questions building from there.
(cribbing a random mixed-units linear programming example from here, although this is not a homework question as will hopefully become clear)
Bob’s bakery sells bagels and muffins. To bake a dozen bagels Bob needs 5 cups of flour, 2 eggs, and one cup of sugar. To bake a dozen muffins Bob needs 4 cups of flour, 4 eggs, and two cups of sugar. Bob can sell bagels at $10/dozen and muffins at $12/dozen. Bob has 50 cups of flour, 30 eggs and 20 cups of sugar. How many bagels and muffins should Bob bake in order to maximize his revenue?
So let's put that in matrix form (arbitrary concrete syntax...):
A = [ [ 5 cups of flour / dozen bagels, 4 cups of flour / dozen muffins ],
[ 2 eggs / dozen bagels, 4 eggs / dozen muffins ],
[ 1 cups of sugar / dozen bagels, 2 cups of sugar / dozen muffins ] ]
B = [ [ 10 dollars / dozen bagels, 12 dollars / dozen muffins ] ]
C = [ [ 50 cups of flour ],
[ 30 eggs ],
[ 20 cups of sugar ] ]
We now want to maximize the inner product B * X
such that A * X <= C
and X >= 0,
which is an ordinary linear programming problem.
In a hypothetical language with unit checking, how would we ideally represent the types of these matrices?
I'm thinking that an m by n matrix only needs m + n units, not the full m * n units, because unless the units factor sensibly into row and column units, the only sensible operations remaining are adding/subtracting the fully general matrix to/from another with exactly the same units in every cell, or multiplying it by a scalar.
What I mean is that the arrangement of units in A is far more useful than the arrangement in:
WTF = [ [ 6 pigeons, 11 cups of sugar ],
[ 1 cup of sugar, 27 meters ],
[ 2 ohms, 2 meters ] ]
Furthermore, I suspect that situations like the latter simply don't arise in practice. (Anyone got a counterexample?)
Under this simplifying assumption, we can represent the units of an m by n matrix with just m + n units as follows. For each of the m rows, we figure out what units are common to all entries in that row, and similarly for the n columns. Let's put the row units in a column vector and the column units in a row vector, because that makes Units(M) = RowUnits(M) * ColUnits(M) (an outer product), which seems like a nice property. So, in the example:
RowUnits(A) = [ [ cups of flour ],
[ eggs ],
[ cups of sugar ] ]
ColUnits(A) = [ [ dozen bagels ^ -1, dozen muffins ^ -1 ] ]
RowUnits(B) = [ [ dollars ] ]
ColUnits(B) = [ [ dozen bagels ^ -1, dozen muffins ^ -1 ] ]
RowUnits(C) = [ [ cups of flour ],
[ eggs ],
[ cups of sugar ] ]
ColUnits(C) = [ [ 1 ] ]
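To make the representation concrete, here is a rough sketch in C++ of how just these m row units and n column units could be stored and queried. (Purely illustrative: the Dim type and all names are invented for this example, and it only checks units at run time rather than in a type system.)

#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A physical dimension: base-unit name -> integer exponent,
// e.g. { {"cup of flour", 1}, {"dozen bagels", -1} }.
using Dim = std::map<std::string, int>;

// Combine two dimensions (multiply the underlying units).
Dim operator*(const Dim& a, const Dim& b) {
    Dim r = a;
    for (const auto& [unit, exp] : b) {
        r[unit] += exp;
        if (r[unit] == 0) r.erase(unit);   // e.g. eggs * eggs^-1 cancels out
    }
    return r;
}

// Units of an m-by-n matrix, stored as only m row units and n column units.
struct MatrixUnits {
    std::vector<Dim> rowUnits;   // RowUnits(M), length m
    std::vector<Dim> colUnits;   // ColUnits(M), length n

    // Units(M)[i][j] = RowUnits(M)[i] * ColUnits(M)[j]
    Dim entry(std::size_t i, std::size_t j) const {
        return rowUnits[i] * colUnits[j];
    }
};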
It seems that (although I'm not sure how to prove it...) the units of M1 * M2 are given by RowUnits(M1 * M2) = RowUnits(M1) and ColUnits(M1 * M2) = ColUnits(M2), and that for the multiplication to make sense we require ColUnits(M1) * RowUnits(M2) = 1.
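Continuing the illustrative sketch from above, that rule could be checked like this (again only a run-time check, not a type-system guarantee):

#include <stdexcept>

// Unit rule for M1 * M2: every inner pair ColUnits(M1)[k] * RowUnits(M2)[k]
// must cancel to dimensionless; the result takes RowUnits(M1) and ColUnits(M2).
MatrixUnits multiplyUnits(const MatrixUnits& m1, const MatrixUnits& m2) {
    if (m1.colUnits.size() != m2.rowUnits.size())
        throw std::invalid_argument("inner dimensions differ");
    for (std::size_t k = 0; k < m1.colUnits.size(); ++k)
        if (!(m1.colUnits[k] * m2.rowUnits[k]).empty())   // empty Dim == dimensionless
            throw std::invalid_argument("inner units do not cancel");
    return MatrixUnits{ m1.rowUnits, m2.colUnits };
}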
We can now infer units for X, because the constraint A * X <= C must mean that A * X and C have the same units. This means that RowUnits(A) = RowUnits(C) (which checks out), ColUnits(X) = ColUnits(C), and RowUnits(X) is the element-wise reciprocal of the transpose of ColUnits(A); in other words RowUnits(X) = [ [ dozen bagels ], [ dozen muffins ] ].
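As a quick sanity check with these units, the first row of A * X reads
(5 cups of flour / dozen bagels) * (x1 dozen bagels) + (4 cups of flour / dozen muffins) * (x2 dozen muffins) = (5*x1 + 4*x2) cups of flour,
which has the same units as the first entry of C (50 cups of flour), so the constraint A * X <= C is dimensionally consistent.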
("Hooray", I hear you cheering, "we have just gone around the moon to look at something completely obvious!")
My questions are these:
1. Are there good real-world examples where matrices with mixed units like these actually show up, beyond toy problems like the one above?
2. Is the RowUnits/ColUnits representation canonical in any sense? It certainly isn't unique: you could multiply every row unit by furlongs ^ 17 just so that you can multiply every column unit by furlongs ^ -17. Does that matter?
3. Are the multiplication rules I guessed above correct, and how would one prove them in general?
4. What are the units of Inverse(M)? My guess is that they are the element-wise reciprocal of the units of Transpose(M), but I don't know how to show it for the general case or even if it is true for the general case.
5. Is there existing work (papers, libraries, languages) on tracking units in matrices like this?
My real-world applications of interest are preventing screw-ups in signal processing/controller software by making sure that all the filter gains etc. have the correct units everywhere; matrices like these, with different units in different cells, are extremely common in those applications.
Upvotes: 4
Views: 2122
Reputation: 26
A very good source of information about the topic is the book "Multidimensional Analysis" by George W. Hart, which is also referenced in the comments to the question. Regarding the individual questions (the mathematical results are explained in the book):
First of all, we need to introduce a little notation. The square bracket operator extracts the physical unit(s) of a matrix / matrix entry, i.e. [A_ij] gives us the unit of the entry (i, j) in the matrix A. Furthermore, the ~ operator transposes a vector / matrix and inverts its units, i.e. [x^\sim] = [x^T]^-1.
Then we need another result from the book which says that the units in any multipliable matrix A can be written as [A] = [a b^\sim], i.e. [A_ij] = [a_i] / [b_j]: the units of the matrix entries can be described by the outer product of two vectors a and b, where a is a column vector and b^\sim is a row vector with its units inverted.
Answer to 4): the units of the inverse of a matrix A are given by [A^-1] = [b a^\sim], i.e. [A^-1_ij] = 1 / [A_ji], which means that the units are the element-wise reciprocal of the units of the transposed matrix, as already proposed in the original question.
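As a small worked check (my own example, not from the book): take a 2 by 2 matrix with units
[A] = [ [ m, m*s ],
[ m/s, m ] ]
which factors as a = (m, m/s) and b^\sim = (1, s). The formula [A^-1] = [b a^\sim] then gives
[A^-1] = [ [ 1/m, s/m ],
[ 1/(m*s), 1/m ] ]
which is exactly the element-wise reciprocal of the transpose of [A], and multiplying these layouts confirms that the diagonal of A^-1 * A comes out dimensionless.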
Regarding 3): Yes, your rules regarding matrix multiplication are correct if the inner units are identical and thus cancel out, i.e. if ColUnits(M1) * RowUnits(M2) = 1 in the question's notation.
However, it would also be possible to multiply matrices where the inner column/row units are merely dimensionally parallel. Writing [A_1] = [a_1 b_1^\sim] and [A_2] = [a_2 b_2^\sim], the product then has units [A_1 A_2] = [a_2 / b_1] * [a_1 b_2^\sim] (here a_2 / b_1 informally refers to the common unit factor by which b_1 has to be multiplied to obtain a_2).
Regarding 2): Usually each row or column of a matrix refers to a specific physical quantity, which induces a canonical representation for the row and column units. As an example, consider an (x, y) position covariance matrix in a Kalman filtering application. There, it is clear that the rows and columns describe x- and y-position in a certain physical unit, e.g. all row and column units are [m], leading to a unit of [m^2] for each of the matrix elements. It would also be possible to use [m/s] and [s*m] as row and column units, which would also lead to a unit of [m^2] for each element.
However, it should be clear from the problem description that the first representation is the more meaningful.
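In the question's notation, both choices decompose the same element units: Units(P) = [ [ m^2, m^2 ], [ m^2, m^2 ] ] can be written with RowUnits(P) = [ [ m ], [ m ] ] and ColUnits(P) = [ [ m, m ] ], or equally well with RowUnits(P) = [ [ m/s ], [ m/s ] ] and ColUnits(P) = [ [ s*m, s*m ] ]; only the first choice matches the physical meaning of the rows and columns.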
Regarding 5): There is the book by George W. Hart and a very short paper with a few key results from the book https://www.georgehart.com/research/tdm.ps . As far as implementations in software are concerned, this talk from Meeting C++ 2021 https://www.youtube.com/watch?v=4LmMwhM8ODI describes the implementation of a library that is capable of annotating matrices and vectors with physical units where all checks are done at compile time. The presented library is also able to fully handle the Discrete Kalman Filter example that is described in one of the other answers.
Upvotes: 1
Reputation: 1
I think that a good real-world example - an answer to your question No 1 - could be a Discrete Kalman Filter implementation. In general, its equations operate on tuples and matrices where some of the entries represent values that correspond to physical units.
As long as there is a need for multidimensional Kalman filter calculations, where the estimated quantities are not homogeneous in type, it looks to me that there will be inversions and transpositions, as well as multiplications, between matrices having a certain (symmetrical) mixture of values representing different physical things in their elements.
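To illustrate (a standard constant-velocity example, not taken from my own code): with a state vector x = [ position; velocity ] in units [ m; m/s ], the state transition matrix F = [ [ 1, dt ], [ 0, 1 ] ] has units
Units(F) = [ [ 1, s ],
[ 1/s, 1 ] ]
i.e. RowUnits(F) = [ [ m ], [ m/s ] ] and ColUnits(F) = [ [ 1/m, s/m ] ], and the covariance matrix P picks up units
Units(P) = [ [ m^2, m^2/s ],
[ m^2/s, m^2/s^2 ] ]
so these are exactly the kind of symmetric mixed-unit matrices described above.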
I've just touched the problem when trying to implement a sensor fusion algorithm using a DKF in C++ with strongly typed value representations in the code.
There are plenty of Kalman filter implementations where developers simply use vectors and matrices of integers, floats, doubles etc., but I'd like to keep track of the dimension of every value used, and that is where the issue appeared.
Unfortunately, I'm quite fresh to the problem (still working on it), so I'm not yet able to help you with questions 2 to 5 right now :-/
Upvotes: 0
Reputation: 9572
4) If you accept that inv(M)*M = Identity is dimensionless (since Id*X = X), this follows by the same argument as 3)
5) I would look into whether it is possible to extend Boost.Units http://www.boost.org/doc/libs/1_50_0/doc/html/boost_units.html, and possibly contact the authors
1 & 3)
Your dimensional problem Adim*Xdim=Ydim can be transformed into a dimensionless problem A*X=Y, where X and Y hold the bare numeric values stripped of their units
That is
Ydim=diag(units(Y))*Y
Ydim=diag(units(Y))*A*X
Ydim=diag(units(Y))*A*inv(diag(units(X)))*Xdim
Ydim=Adim*Xdim
Or the other way
Adim*Xdim=Ydim
Adim*diag(units(X))*X=diag(units(Y))*Y
Adim*diag(units(X))*X=diag(units(Y))*A*X
This holds for any X, so you find that
A=inv(diag(units(Y))) * Adim * diag(units(X))
Adim=diag(units(Y)) * A * inv(diag(units(X)))
That is, you multiply the rows of the dimensionless A by the units of Y and its columns by the inverses of the units of X to recover Adim
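Applied to the bakery example from the question (my own worked instance of the recipe above): units(Y) = (cups of flour, eggs, cups of sugar), units(X) = (dozen bagels, dozen muffins), and the dimensionless A is just the bare numbers, so
Adim = diag(cups of flour, eggs, cups of sugar) * [ [ 5, 4 ], [ 2, 4 ], [ 1, 2 ] ] * diag(1 / dozen bagels, 1 / dozen muffins)
which reproduces, entry by entry, the unit-annotated A from the question, e.g. Adim(1,1) = 5 cups of flour / dozen bagels.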
2) Mathematically you can trivially factor a scalar quantity out of the matrix, so I think the question is more related to 5): how do you represent this in software? You'll have to take this feature as a requirement when answering 5)
Upvotes: 1