Reputation: 121
What I would consider a diagonal tensor is a tensor t of shape (d1, ..., dr) which is all zero except where all the indices are equal. So t[i,j,k,l] = 0 unless i == j == k == l. A function to create such a tensor should take in a shape (d1, ..., dr) and a vector [a1, ..., ak] of length k = min(d1, ..., dr), placing these values along the diagonal.
I would like to do this in TensorFlow, and the most relevant function I could find was tf.linalg.tensor_diag, but it doesn't do what I want. For instance, the diagonal input can itself be a tensor of any rank, and the output tensor always has twice the rank of the input, so it can never output tensors of odd rank.
The documentation says "Given a diagonal, this operation returns a tensor with the diagonal and everything else padded with zeros", but I don't know how to square that with its actual behavior.
My question has two parts:
What is the best way in TF to create what I am calling a diagonal tensor? Is there another name for this?
Why does linalg.tensor_diag work like this? What is the intended use?
Here is an example output:
```
>>> tf.linalg.tensor_diag([[1,2],[3,4]])
<tf.Tensor: shape=(2, 2, 2, 2), dtype=int32, numpy=
array([[[[1, 0],
         [0, 0]],

        [[0, 2],
         [0, 0]]],


       [[[0, 0],
         [3, 0]],

        [[0, 0],
         [0, 4]]]], dtype=int32)>
```
Upvotes: 1
Views: 1624
Reputation: 676
So this is a little tricky to think about but I'll try to explain the thinking.
If you do tf.linalg.tensor_diag([1,2,3,4])
this intuitively gives a matrix with that diagonal:
[[1, 0, 0, 0],
[0, 2, 0, 0],
[0, 0, 3, 0],
[0, 0, 0, 4]]
Notice you went from rank 1 to rank 2 doing this: the rank doubled. So "diagonalizing" an input always ends up doubling its rank.
Now your question, tf.linalg.tensor_diag([[1,2],[3,4]])
What you're passing in is a matrix so rank 2
[[1, 2],
[3, 4]]
But now, how should this be diagonalized? It's rank 2, so following the pattern means we'll end up with something of rank 4. In the previous example, diagonalizing sort of "pulled up" the vector into the higher rank, and each step of "pulling up" took a single value from the input and placed it on the diagonal.
So this matrix will also be "pulled up", with each step leaving one value behind. It's going to make 4 squares of [[0,0],[0,0]]
and drop one value into each, at the same position that value had in the input matrix. This would give us
[[1,0],
[0,0]]
[[0,2],
[0,0]]
[[0,0],
[3,0]]
[[0,0],
[0,4]]
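Lastly, the squares are grouped the same way their values were grouped in the input: 1 and 2 sat together in the row [1,2], so their squares sit together in the first outer group, and likewise for 3 and 4. That gives the final result of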
[
[
[[1,0],
[0,0]] ,
[[0,2],
[0,0]]
],
[
[[0,0],
[3,0]] ,
[[0,0],
[0,4]]
]
]
Which indeed gives us a rank 4 result 👍
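To make the "pull up and group" idea concrete: the result above matches the rule output[i, j, k, l] = input[i, j] when (i, j) == (k, l), and 0 otherwise. Here is a small NumPy sketch of that rule. The name tensor_diag_reference is made up for illustration, and this is only a reference for the behavior described above, not how TensorFlow implements it.

```python
import numpy as np

def tensor_diag_reference(diagonal):
    """Illustrative helper: out[idx + idx] = diagonal[idx], zeros elsewhere."""
    d = np.asarray(diagonal)
    # The output shape is the input shape repeated twice, so the rank doubles.
    out = np.zeros(d.shape + d.shape, dtype=d.dtype)
    for idx in np.ndindex(*d.shape):
        # idx + idx concatenates the index tuple with itself, e.g. (1, 0, 1, 0).
        out[idx + idx] = d[idx]
    return out

result = tensor_diag_reference([[1, 2], [3, 4]])
print(result.shape)   # rank 4: the input shape (2, 2) repeated twice
print(result[1, 0])   # the square that holds the value 3
```

Another way to read the same rule: flatten the input, build an ordinary diagonal matrix from it, then reshape that matrix to the doubled shape. That is why the output rank is always even.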
Note: You may want to look into the other diag function for more of what you're trying to do.
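For the first part of the question (values only where all indices are equal, for any rank), one possible approach is to scatter the values into a zero tensor with tf.scatter_nd. This is a minimal sketch, not an established idiom: diagonal_tensor is a hypothetical helper name, and it assumes shape is a plain Python tuple and that values has length min(shape).

```python
import tensorflow as tf

def diagonal_tensor(shape, values):
    """Hypothetical helper: zeros of `shape` with values[i] at [i, i, ..., i]."""
    values = tf.convert_to_tensor(values)
    rank = len(shape)
    k = min(shape)  # length of the diagonal
    idx = tf.range(k)
    # Row i of `indices` is [i, i, ..., i], repeated `rank` times: shape (k, rank).
    indices = tf.stack([idx] * rank, axis=1)
    # scatter_nd starts from zeros and writes each value at its index row.
    return tf.scatter_nd(indices, values, shape)

# A rank-3 example, which tensor_diag cannot produce.
t = diagonal_tensor((2, 3, 2), [5, 7])
print(t.shape)
```

In this example t[0, 0, 0] is 5, t[1, 1, 1] is 7, and every other entry is 0.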
Upvotes: 1