Reputation: 1215
Can BERT be used for non-text sequence data? I want to try BERT for sequence classification problems, but my data is not text, so I would need to train BERT from scratch. How do I do that?
Upvotes: 1
Views: 1316
Reputation: 11213
The Transformer architecture can be used for any data that forms a sequence of discrete symbols. BERT is trained with the masked language model objective, i.e., it is trained to fill in gaps in a sequence based on the rest of the sequence. If your data is of that kind, you can train a BERT-like model on it. With sequences of continuous vectors, you would need to come up with a suitable alternative to masked language modeling.
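As a concrete illustration, here is a minimal sketch of pretraining a BERT-style model from scratch on integer-encoded symbol sequences using the Hugging Face transformers library. The vocabulary size, special-token IDs, and model dimensions below are placeholder assumptions; adapt them to your data.

    # Minimal sketch: masked-"language"-model pretraining on non-text
    # sequences of discrete symbols. Assumes each sequence is already
    # encoded as integer IDs in [0, VOCAB_SIZE). All hyperparameters
    # and the special-token layout are illustrative.
    import torch
    from transformers import BertConfig, BertForMaskedLM

    VOCAB_SIZE = 1000   # size of your symbol inventory (assumption)
    PAD_ID = 0          # one ID reserved for padding (assumption)
    MASK_ID = 1         # one ID reserved as the [MASK] symbol (assumption)

    config = BertConfig(
        vocab_size=VOCAB_SIZE,
        hidden_size=256,             # small model, for illustration only
        num_hidden_layers=4,
        num_attention_heads=4,
        intermediate_size=1024,
        max_position_embeddings=128,
        pad_token_id=PAD_ID,
    )
    model = BertForMaskedLM(config)  # randomly initialized, trained from scratch

    def mask_batch(input_ids, mask_prob=0.15):
        """Replace a random 15% of non-pad positions with MASK_ID.
        Labels are -100 everywhere else, so the loss is computed only
        on the positions the model has to reconstruct."""
        labels = input_ids.clone()
        masked = torch.rand(input_ids.shape) < mask_prob
        masked &= input_ids != PAD_ID
        labels[~masked] = -100
        corrupted = input_ids.clone()
        corrupted[masked] = MASK_ID
        return corrupted, labels

    # One training step on a toy batch of already-encoded sequences.
    batch = torch.randint(2, VOCAB_SIZE, (8, 128))  # 8 sequences, length 128
    inputs, labels = mask_batch(batch)
    outputs = model(input_ids=inputs,
                    attention_mask=(inputs != PAD_ID).long(),
                    labels=labels)
    outputs.loss.backward()  # plug into your optimizer loop of choice

After pretraining, the same configuration can be used to initialize BertForSequenceClassification for the downstream classification task.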
You can follow any of the many tutorials that you can find online, e.g., from the Hugging Face blog or towardsdatascience.com.
Upvotes: 1