Allan-J

Reputation: 365

Correct Way to Fine-Tune/Train HuggingFace's Model from scratch (PyTorch)

For example, I want to train a BERT model from scratch but using the existing configuration. Is the following code the correct way to do so?

model = BertModel.from_pretrained('bert-base-cased')
model.init_weights()

I ask because I think the init_weights method re-initializes all the weights.

Second question: what if I want to change the configuration a bit, such as the number of hidden layers?

model = BertModel.from_pretrained('bert-base-cased', num_hidden_layers=10)
model.init_weights()

I wonder if the above is the correct way to do so, because it doesn't raise an error when I run it.

Upvotes: 4

Views: 2133

Answers (1)

Jindřich

Reputation: 11240

In this way, you would unnecessarily download and load the pre-trained model weights. You can avoid that by downloading only the BERT config:

config = transformers.AutoConfig.from_pretrained("bert-base-cased")
model = transformers.AutoModel.from_config(config)

Both your solution and this one assume you want to tokenize the input in the same way as the original BERT and use the same vocabulary. If you want to use a different vocabulary, you can change it in the config before instantiating the model:

config.vocab_size = 123456

Similarly, you can change any hyperparameter that you want to have different from the original BERT.
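Putting the pieces together, here is a minimal sketch of building a randomly initialized BERT with modified hyperparameters. It uses `BertConfig`'s keyword arguments instead of `AutoConfig.from_pretrained`, so it runs without network access; the defaults match the bert-base architecture, and the specific override values (10 layers, a vocabulary of 123456) are just the examples from this thread.

```python
from transformers import BertConfig, BertModel

# Start from the stock bert-base settings and override only the
# hyperparameters you want to differ from the original BERT.
# (With network access you could instead start from
# AutoConfig.from_pretrained("bert-base-cased") and set the
# attributes on that config object.)
config = BertConfig(
    num_hidden_layers=10,  # fewer layers than the original 12
    vocab_size=123456,     # custom vocabulary size
)

# Instantiating from a config gives randomly initialized weights --
# no pre-trained checkpoint is downloaded, so this model is ready
# to be trained from scratch.
model = BertModel(config)

print(model.config.num_hidden_layers)  # 10
print(model.config.vocab_size)         # 123456
```

Because the weights are random, you would then train this model with your own objective (e.g. masked language modeling) rather than fine-tune it directly.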

Upvotes: 4
