bunny

Reputation: 123

RuntimeError: mat1 and mat2 shapes cannot be multiplied

I'm trying to input a 5D tensor with shape (1, 8, 32, 32, 32) to a VAE I wrote:

self.encoder = nn.Sequential(
    nn.Conv3d(8, 16, 4, 2, 1),   # 32 -> 16
    nn.BatchNorm3d(16),
    nn.LeakyReLU(0.2),

    nn.Conv3d(16, 32, 4, 2, 1),  # 16 -> 8
    nn.BatchNorm3d(32),
    nn.LeakyReLU(0.2),

    nn.Conv3d(32, 48, 4, 2, 1),  # 8 -> 4
    nn.BatchNorm3d(48),
    nn.LeakyReLU(0.2),
)

self.fc_mu = nn.Linear(3072, 100)  # 48*4*4*4 = 3072
self.fc_logvar = nn.Linear(3072, 100)
    
self.decoder = nn.Sequential(
    nn.Linear(100, 3072),
    nn.Unflatten(1, (48, 4, 4)),
    nn.ConvTranspose3d(48, 32, 4, 2, 1),  # 4 -> 8
    nn.BatchNorm3d(32),
    nn.Tanh(),

    nn.ConvTranspose3d(32, 16, 4, 2, 1),  # 8 -> 16
    nn.BatchNorm3d(16),
    nn.Tanh(),

    nn.ConvTranspose3d(16, 8, 4, 2, 1),   # 16 -> 32
    nn.BatchNorm3d(8),
    nn.Tanh(),
)

def reparametrize(self, mu, logvar):
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + eps * std

def encode(self, x):
    x = self.encoder(x)
    x = x.view(-1, x.size(1))

    mu = self.fc_mu(x)
    logvar = self.fc_logvar(x)

    return self.reparametrize(mu, logvar), mu, logvar

def decode(self, x):
    return self.decoder(x)

def forward(self, data):
    z, mu, logvar = self.encode(data)
    return self.decode(z), mu, logvar

The error I'm getting is: RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x48 and 3072x100). I thought I had calculated the output dimensions of each layer correctly, but I must have made a mistake somewhere, and I can't see where.

Upvotes: 0

Views: 16776

Answers (1)

Natthaphon Hongcharoen

Reputation: 2430

This line

x = x.view(-1, x.size(1))

means you keep the second dimension (the channels) as-is and fold everything else into the first dimension (the batch).

Since the output of self.encoder is (1, 48, 4, 4, 4), doing that gives you (64, 48), but from the look of it you want (1, 3072) instead.
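A quick way to convince yourself of the arithmetic (a minimal sketch; the shape is just the encoder output from the question):

import torch

x = torch.randn(1, 48, 4, 4, 4)      # encoder output shape from the question
print(x.numel())                     # 3072 = 48*4*4*4 elements in total
print(x.view(-1, x.size(1)).shape)   # torch.Size([64, 48]): 3072 / 48 = 64 rows
print(x.view(x.size(0), -1).shape)   # torch.Size([1, 3072]): what fc_mu expects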

So this should solve this particular problem:

x = x.view(x.size(0), -1)
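If you'd rather keep the reshape inside the model, an nn.Flatten() module at the end of the encoder Sequential does the same thing (a sketch of an alternative, not what the question's code does; by default it flattens everything after the batch dimension):

import torch
import torch.nn as nn

flatten = nn.Flatten()           # start_dim=1 by default
x = torch.randn(1, 48, 4, 4, 4)
print(flatten(x).shape)          # torch.Size([1, 3072])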

Then you'll run into RuntimeError: unflatten: Provided sizes [48, 4, 4] don't multiply up to the size of dim 1 (3072) in the input tensor.

The cause is the Unflatten here:

nn.Linear(100, 3072),
nn.Unflatten(1, (48, 4, 4)),
nn.ConvTranspose3d(48, 32, 4, 2, 1)

The target shape has to be (48, 4, 4, 4) instead: the 3072 features need to unflatten into a channel dimension plus three spatial dimensions, because ConvTranspose3d expects a 5D input, and 48 * 4 * 4 * 4 = 3072.
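With that change the decoder round-trips back to the input shape. A quick check (a sketch; layer arguments copied from the question's decoder):

import torch
import torch.nn as nn

decoder = nn.Sequential(
    nn.Linear(100, 3072),
    nn.Unflatten(1, (48, 4, 4, 4)),       # 3072 = 48*4*4*4
    nn.ConvTranspose3d(48, 32, 4, 2, 1),  # 4 -> 8
    nn.BatchNorm3d(32),
    nn.Tanh(),
    nn.ConvTranspose3d(32, 16, 4, 2, 1),  # 8 -> 16
    nn.BatchNorm3d(16),
    nn.Tanh(),
    nn.ConvTranspose3d(16, 8, 4, 2, 1),   # 16 -> 32
    nn.BatchNorm3d(8),
    nn.Tanh(),
)

z = torch.randn(1, 100)
print(decoder(z).shape)  # torch.Size([1, 8, 32, 32, 32]): matches the input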

Upvotes: 4
