Reputation: 1653
I'm training a transformer model with OpenNMT-py on MIDI music files, but results are poor because I only have access to a small dataset for the style I want to study. To help the model learn something useful, I would like to pre-train on a much larger dataset of other musical styles and then fine-tune on the small dataset.
I was thinking of freezing the encoder side of the transformer after pre-training and leaving the decoder free during fine-tuning. How would one do this with OpenNMT-py?
Upvotes: 0
Views: 573
Reputation: 37691
Please be more specific about your question and show some code; that will help you get a productive response from the SO community.
If I were in your place and wanted to freeze a neural network component, I would simply do:
for name, param in self.encoder.named_parameters():
    param.requires_grad = False
Here I assume you have an NN module like the following.
class Net(nn.Module):
    def __init__(self, params):
        super(Net, self).__init__()
        self.encoder = TransformerEncoder(num_layers,
                                          d_model,
                                          heads,
                                          d_ff,
                                          dropout,
                                          embeddings,
                                          max_relative_positions)

    def forward(self):
        # write your code
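To illustrate the idea end to end, here is a minimal self-contained sketch. It uses plain nn.Linear layers as stand-ins for OpenNMT-py's TransformerEncoder/TransformerDecoder (the real module names and hyperparameters are not shown here); the freezing mechanism is identical. Note that you should also pass only the still-trainable parameters to the optimizer, so the frozen encoder is skipped entirely.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Placeholders for TransformerEncoder / TransformerDecoder.
        self.encoder = nn.Linear(8, 8)
        self.decoder = nn.Linear(8, 8)

    def forward(self, x):
        return self.decoder(self.encoder(x))

net = Net()

# Freeze every encoder parameter so no gradients are computed for it.
for name, param in net.encoder.named_parameters():
    param.requires_grad = False

# Give the optimizer only the parameters that are still trainable.
optimizer = torch.optim.SGD(
    (p for p in net.parameters() if p.requires_grad), lr=0.1)

enc_before = net.encoder.weight.clone()
dec_before = net.decoder.weight.clone()

# One dummy training step.
loss = net(torch.randn(4, 8)).pow(2).mean()
loss.backward()
optimizer.step()

# The encoder stayed fixed; only the decoder was updated.
encoder_unchanged = torch.equal(net.encoder.weight, enc_before)
decoder_changed = not torch.equal(net.decoder.weight, dec_before)
```

After the step, `encoder_unchanged` is True and `decoder_changed` is True, confirming that only the decoder is being fine-tuned.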
Upvotes: 1