Reputation: 122168
Given an input tensor of shape (C, B, H) torch.Size([2, 5, 32]) of some neural net layers, where
channels = 2
batch_size = 5
hidden_size = 32
The goal is to flatten the channels and manipulate the input tensor to the shape (B, C*H) torch.Size([5, 2 * 32]), where:
batch_size = 5
hidden_size = 32 * 2
I've tried to do the following:
import torch
t = torch.rand([2, 5, 32])
# Changed from (channels, batch_size, hidden_size)
# -> (batch_size, channels, hidden_size)
t = t.permute(1, 0, 2)
# Reshape using view(), where batch_size is t.size(0)
# and -1 is to flatten the left over values to the other dimension.
z = t.contiguous().view(t.size(0), -1)
print(z.shape)
print(z)
[out]:
torch.Size([5, 64])
tensor([[0.3911, 0.9586, 0.2104, 0.3937, 0.9976, 0.3378, 0.0630, 0.6676, 0.0806,
0.9311, 0.5219, 0.1697, 0.7442, 0.5162, 0.2555, 0.0826, 0.5502, 0.9700,
0.3375, 0.5012, 0.9025, 0.8176, 0.1465, 0.1848, 0.3460, 0.9999, 0.7892,
0.7577, 0.6615, 0.2620, 0.6868, 0.2003, 0.4840, 0.8354, 0.9253, 0.3172,
0.9516, 0.8962, 0.1272, 0.2268, 0.6510, 0.5166, 0.6772, 0.9616, 0.9826,
0.5254, 0.9191, 0.4378, 0.7048, 0.8808, 0.0299, 0.1102, 0.9710, 0.8714,
0.7256, 0.9684, 0.6117, 0.1957, 0.8663, 0.4742, 0.2843, 0.6548, 0.9592,
0.1559],
[0.2333, 0.0858, 0.5284, 0.2965, 0.3863, 0.3370, 0.6940, 0.3387, 0.3513,
0.1022, 0.3731, 0.3575, 0.7095, 0.0053, 0.7024, 0.4091, 0.3289, 0.5808,
0.5640, 0.8847, 0.7584, 0.8878, 0.9873, 0.0525, 0.7731, 0.2501, 0.9926,
0.5226, 0.0925, 0.0300, 0.4176, 0.0456, 0.4643, 0.4497, 0.5920, 0.9519,
0.6647, 0.2379, 0.4927, 0.9666, 0.1675, 0.9887, 0.7741, 0.5668, 0.7376,
0.4452, 0.7449, 0.1298, 0.9065, 0.3561, 0.5813, 0.1439, 0.2115, 0.5874,
0.2038, 0.1066, 0.3843, 0.6179, 0.8321, 0.9428, 0.1067, 0.5045, 0.9324,
0.3326],
[0.6556, 0.1479, 0.9288, 0.9238, 0.1324, 0.0718, 0.6620, 0.2659, 0.7162,
0.7559, 0.7564, 0.2120, 0.3943, 0.9497, 0.7520, 0.8455, 0.4444, 0.4708,
0.8371, 0.6365, 0.3616, 0.0326, 0.1581, 0.4973, 0.6701, 0.9245, 0.8274,
0.3464, 0.7044, 0.5376, 0.0441, 0.5210, 0.8603, 0.7396, 0.2544, 0.3514,
0.5686, 0.3283, 0.7248, 0.4303, 0.9531, 0.5587, 0.8703, 0.1585, 0.9161,
0.9043, 0.9778, 0.4489, 0.9463, 0.8655, 0.5576, 0.1135, 0.1268, 0.3424,
0.1504, 0.2265, 0.1734, 0.1872, 0.3995, 0.1191, 0.0532, 0.6109, 0.1662,
0.6937],
[0.6342, 0.1922, 0.1758, 0.4625, 0.7654, 0.6509, 0.2908, 0.1546, 0.4768,
0.3779, 0.2490, 0.0086, 0.6170, 0.5425, 0.6953, 0.4730, 0.5834, 0.8326,
0.0165, 0.8236, 0.0023, 0.7479, 0.5621, 0.9894, 0.5957, 0.0857, 0.6087,
0.5667, 0.5478, 0.8197, 0.9228, 0.7329, 0.4434, 0.5894, 0.9860, 0.6133,
0.2395, 0.4718, 0.8830, 0.6361, 0.6104, 0.6630, 0.5084, 0.7604, 0.7591,
0.3601, 0.6888, 0.6767, 0.9178, 0.5291, 0.0591, 0.4320, 0.7875, 0.5038,
0.4419, 0.0319, 0.3719, 0.5843, 0.0334, 0.3525, 0.0023, 0.1205, 0.4040,
0.7908],
[0.0989, 0.8436, 0.0425, 0.6247, 0.6091, 0.4778, 0.2692, 0.4785, 0.9217,
0.9604, 0.6355, 0.4686, 0.9414, 0.7722, 0.8013, 0.1660, 0.6578, 0.6414,
0.6814, 0.6212, 0.4124, 0.7102, 0.7416, 0.7404, 0.9842, 0.6542, 0.0106,
0.3826, 0.5529, 0.8079, 0.9855, 0.3012, 0.2341, 0.9353, 0.6597, 0.7177,
0.8214, 0.1438, 0.4729, 0.6747, 0.9310, 0.4167, 0.3689, 0.8464, 0.9395,
0.9407, 0.8419, 0.5486, 0.1786, 0.1423, 0.9900, 0.9365, 0.3996, 0.1862,
0.6232, 0.7547, 0.7779, 0.4767, 0.6218, 0.9079, 0.6153, 0.1488, 0.5960,
0.4015]])
Although permute() + view() achieves the desired output, are there other ways to perform the same operation? Is there a better way that can directly reshape without first permuting the order of the dimensions?
Upvotes: 4
Views: 3982
Reputation: 8538
Einops lets you do such element rearrangements in one (readable) line:
from einops import rearrange
import torch
t = torch.rand([2, 5, 32])
y = rearrange(t, 'c b h -> b (c h)')
print(y.shape)  # torch.Size([5, 64])
Upvotes: 0
Reputation: 114866
Let's look "behind the curtain" and see why one must have both permute/transpose and view in order to go from a C-B-H tensor to a B-C*H one:
Elements of a tensor are stored as one long contiguous vector in memory. For instance, a 2-3-4 tensor has 24 elements stored at 24 consecutive places in memory. The tensor also has a "header" that tells PyTorch to treat these 24 values as a 2-by-3-by-4 tensor. This is done by storing not only the size of the tensor, but also its "strides": how many elements one needs to jump in memory to get to the next element along each dimension. In our example, size=(2,3,4) and strides=(12, 4, 1) (you can check this out yourself, and you can see more about it here).
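To check the size and strides yourself, here is a minimal sketch. It assumes a small integer tensor built with torch.arange so that each value equals its position in memory; the names a, flat and offset are only illustrative.

```python
import torch

# 24 consecutive values, treated as a 2-by-3-by-4 tensor
a = torch.arange(24).view(2, 3, 4)

size = tuple(a.size())
strides = a.stride()
print(size, strides)  # (2, 3, 4) (12, 4, 1)

# Element [i, j, k] sits at flat position i*12 + j*4 + k*1
i, j, k = 1, 2, 3
offset = i * strides[0] + j * strides[1] + k * strides[2]
flat = a.flatten()
print(a[i, j, k].item(), flat[offset].item())  # 23 23
```

The strides are just the multipliers that turn a multi-dimensional index into a position in the flat vector.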
Now, if you only want to change the size to 2-(3*4), you do not need to move any item of the tensor in memory, only to update the "header" of the tensor. By setting size=(2, 12) and strides=(12, 1) you are done!
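A quick way to see that only the "header" changes is to compare the data pointers before and after view(). This is a sketch on the same 2-3-4 toy tensor; same_memory is just an illustrative name.

```python
import torch

a = torch.arange(24).view(2, 3, 4)

# Only the size and strides change; no element is moved
b = a.view(2, 12)
print(b.stride())  # (12, 1)

# Both tensors start at the same address in memory
same_memory = a.data_ptr() == b.data_ptr()
print(same_memory)  # True

# Writing through one tensor is visible through the other
b[0, 0] = -1
print(a[0, 0, 0].item())  # -1
```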
Alternatively, if you want to "transpose" the tensor to 3-2-4, that's a bit more tricky, but you can still do that by manipulating the strides. Setting size=(3, 2, 4) and strides=(4, 12, 1) gives you exactly what you want without moving any of the real tensor elements in memory.
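The same check works for permute(). The sketch below, again on the 2-3-4 toy tensor, shows that the strides are swapped while the underlying memory stays where it is; the variable names are illustrative.

```python
import torch

a = torch.arange(24).view(2, 3, 4)

# Swap the first two dimensions: 2-3-4 -> 3-2-4
p = a.permute(1, 0, 2)
print(tuple(p.shape), p.stride())  # (3, 2, 4) (4, 12, 1)

# No copy was made, but the elements are no longer laid out in row-major order
shares_memory = p.data_ptr() == a.data_ptr()
print(shares_memory)       # True
print(p.is_contiguous())   # False
```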
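However, once you have manipulated the strides, you cannot trivially change the size of the tensor, because a merged dimension would need two different "stride" values. In the 3-2-4 example, merging the last two dimensions into one of length 8 would require a step of 1 within each group of 4 elements but a larger jump between the groups, and a single stride cannot express both. This is why you must call contiguous() at this point: it copies the elements into a new block of memory laid out in the permuted order.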
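Here is a small sketch of what happens at this step, on the same toy tensor. It relies on view() raising a RuntimeError for an incompatible layout, which is the behaviour I am aware of in current PyTorch versions. The reshape() call at the end is not part of the explanation above; it is shown only as a shorthand that copies when it has to.

```python
import torch

a = torch.arange(24).view(2, 3, 4)
p = a.permute(1, 0, 2)  # size (3, 2, 4), strides (4, 12, 1)

# view() cannot merge the last two dimensions with these strides
try:
    p.view(3, 8)
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)  # True

# contiguous() copies the data into a fresh row-major layout
c = p.contiguous()
print(c.stride())                      # (8, 4, 1)
print(c.data_ptr() == a.data_ptr())    # False

v = c.view(3, 8)

# reshape() returns a view when possible and copies otherwise
r = p.reshape(3, 8)
print(torch.equal(v, r))  # True
```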
Summary
If you want to move from shape (C, B, H) to (B, C*H) you must have permute, contiguous and view operations, otherwise you just scramble the entries of your tensor.
A small example with a 2-3-4 tensor:
a =
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],
       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
If you just change the view of the tensor, you get
a.view(3,8)
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23]])
Which is not what you want!
You need to have
a.permute(1,0,2).contiguous().view(3, 8)
array([[ 0,  1,  2,  3, 12, 13, 14, 15],
       [ 4,  5,  6,  7, 16, 17, 18, 19],
       [ 8,  9, 10, 11, 20, 21, 22, 23]])
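The example above can be reproduced end to end with torch tensors. This is a sketch that assumes a is built with torch.arange. The torch.cat line is my own addition, included only as a cross-check that each output row is the concatenation of the per-channel rows.

```python
import torch

a = torch.arange(24).view(2, 3, 4)

# Changing only the view keeps the memory order and mixes rows
wrong = a.view(3, 8)
print(wrong[0].tolist())  # [0, 1, 2, 3, 4, 5, 6, 7]

# permute + contiguous + view puts the two channels side by side per row
right = a.permute(1, 0, 2).contiguous().view(3, 8)
print(right[0].tolist())  # [0, 1, 2, 3, 12, 13, 14, 15]

# Cross-check: concatenate the channel slices along the last dimension
alt = torch.cat([a[0], a[1]], dim=1)
print(torch.equal(right, alt))  # True
```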
Upvotes: 5