Reputation: 21
I have created a convolutional network using PyTorch and I want to optimize its hyperparameters, e.g. kernel_size, in_channels and out_channels of each layer, dropout_rate, etc. The problem is I cannot find a way to parametrize the in_features of the final linear layer, which means I have to hard-code it every time according to the input size and the internal architecture. Is there a way to define the in_features of the fc layer to be equal to the output size of the dropout layer?
class Conv_v0(torch.nn.Module):
    def __init__(self):
        super(Conv_v0, self).__init__()
        self.conv1 = torch.nn.Conv1d(in_channels=4, out_channels=3, kernel_size=17)
        self.activation = torch.nn.ReLU()
        self.maxpool = torch.nn.MaxPool1d(kernel_size=5)
        self.dropout = torch.nn.Dropout(p=0.5)
        # in_features = 108 for 200kb, 588 for 1kb, 1188 for 2kb
        self.fc = torch.nn.Linear(in_features=1188, out_features=2)
        # self.sigmoid = torch.nn.Sigmoid() will not be used since it is integrated in BCEWithLogitsLoss()

    def forward(self, x):
        x = x.permute(0, 2, 1)
        x = self.conv1(x)
        x = self.activation(x)
        x = self.maxpool(x)
        # Flatten the output of the max pooling layer before passing it to the fully connected layer
        x = x.view(x.size(0), -1)
        # print("Size after reshaping:", x.size())
        x = self.dropout(x)
        x = self.fc(x)
        # x = self.sigmoid(x)
        return x
I have tried initializing in_features=0 and changing it accordingly in the forward function, but I am afraid that this way the weights of the final layer are re-initialized on every forward pass, so no learning is achieved.
class Conv2(nn.Module):
    def __init__(self, out_channels1, kernel_size1, out_channels2, kernel_size2, dropout_rate):
        super(Conv2, self).__init__()
        self.conv1 = nn.Conv1d(in_channels=4, out_channels=out_channels1, kernel_size=kernel_size1)
        self.conv2 = nn.Conv1d(in_channels=out_channels1, out_channels=out_channels2, kernel_size=kernel_size2)
        self.dropout = torch.nn.Dropout(p=dropout_rate)
        self.fc = torch.nn.Linear(in_features=0, out_features=2)

    def forward(self, x):
        x = x.permute(0, 2, 1)
        x = F.max_pool1d(torch.tanh(self.conv1(x)), 2)
        x = F.max_pool1d(torch.tanh(self.conv2(x)), 2)
        x = x.view(x.size(0), -1)
        x = self.dropout(x)
        # Dynamically set the in_features for the fc layer
        if self.fc.in_features == 0:
            self.fc.in_features = x.size(1)
            self.fc = nn.Linear(in_features=x.size(1), out_features=2)
        out = self.fc(x)
        return out
Upvotes: 1
Views: 85
Reputation: 42117
You can use torch.nn.LazyLinear instead of the torch.nn.Linear module. The advantage of LazyLinear is that the value of the in_features argument is inferred lazily after the first forward pass, from the shape of the last dimension of the input to the module. In your case, that would be the output of the dropout layer.
Essentially, you would need:
self.fc = torch.nn.LazyLinear(out_features=2)
FWIW, the weight and bias are also initialized only after the first forward pass (they start out as uninitialized parameters).
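For instance, here is a minimal sketch of your first model with the hard-coded layer swapped for LazyLinear; the dummy batch shape (batch size 8, sequence length 2000, 4 channels) is only an assumption for illustration, and the dummy forward pass is there to materialize the lazy parameters before you build an optimizer.

import torch

class Conv_v0(torch.nn.Module):
    def __init__(self):
        super(Conv_v0, self).__init__()
        self.conv1 = torch.nn.Conv1d(in_channels=4, out_channels=3, kernel_size=17)
        self.activation = torch.nn.ReLU()
        self.maxpool = torch.nn.MaxPool1d(kernel_size=5)
        self.dropout = torch.nn.Dropout(p=0.5)
        # in_features is inferred from the flattened dropout output on the first forward pass
        self.fc = torch.nn.LazyLinear(out_features=2)

    def forward(self, x):
        x = x.permute(0, 2, 1)
        x = self.maxpool(self.activation(self.conv1(x)))
        x = x.view(x.size(0), -1)
        x = self.dropout(x)
        return self.fc(x)

model = Conv_v0()
# Dummy forward pass with an assumed input of shape (batch, length, channels)
out = model(torch.randn(8, 2000, 4))
print(model.fc.in_features)  # 1188 for length 2000, matching the hard-coded value
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # create the optimizer after materialization

Running the dummy forward pass first is recommended for lazy modules in general, so that the optimizer sees the real, materialized weight and bias rather than uninitialized placeholders.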
Upvotes: 1