Reputation: 365
For example, I want to train a BERT model from scratch, but keep the existing configuration. Is the following code the correct way to do so?
from transformers import BertModel

model = BertModel.from_pretrained('bert-base-cased')
model.init_weights()
I'm asking because I think the init_weights method will re-initialize all the weights.
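If that is right, the parameters should no longer match the pretrained checkpoint after calling init_weights. This is the kind of check I have in mind (using BertModel's embedding attribute names; I'm not sure it is the best way to verify this):

import torch
from transformers import BertModel

model = BertModel.from_pretrained('bert-base-cased')
before = model.embeddings.word_embeddings.weight.clone()
model.init_weights()
after = model.embeddings.word_embeddings.weight
# If init_weights really re-initializes everything, the tensors should differ
print(torch.equal(before, after))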
Second question: what if I want to change the configuration a bit, such as the number of hidden layers?
model = BertModel.from_pretrained('bert-base-cased', num_hidden_layers=10)
model.init_weights()
I wonder if this is the correct way to do it, because it doesn't raise an error when I run the code.
Upvotes: 4
Views: 2133
Reputation: 11240
In this way, you would unnecessarily download and load the pre-trained model weights. You can avoid that by downloading only the BERT config and instantiating the model from it:
import transformers

config = transformers.AutoConfig.from_pretrained("bert-base-cased")
model = transformers.AutoModel.from_config(config)
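Note that from_config builds the model with freshly (randomly) initialized weights, so there is no need to call init_weights afterwards. You can quickly confirm the model follows the config, for instance:

print(model.config.num_hidden_layers)  # 12 for bert-base-cased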
Both your solution and this one assume you want to tokenize the input in the same way as the original BERT and use the same vocabulary. If you want to use a different vocabulary, you can change it in the config before instantiating the model:
config.vocab_size = 123456
Similarly, you can change any other hyperparameter that you want to differ from the original BERT.
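For instance, to get the 10-layer model from your question, a minimal sketch (attribute names are those of BertModel in recent transformers versions) would be:

import transformers

config = transformers.AutoConfig.from_pretrained("bert-base-cased")
config.num_hidden_layers = 10          # train from scratch with fewer layers
model = transformers.AutoModel.from_config(config)

print(model.config.num_hidden_layers)  # 10
print(len(model.encoder.layer))        # 10 encoder layers instead of 12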
Upvotes: 4