Reputation: 5395
I have a lot of bank transactions which I want to classify into different categories. The issue is that the text is not a full sentence but consists only of a few words, e.g. "private withdrawal", "payment invoice 19234", "taxes", etc.
Since the domain is so specific, I think we might get better performance by fine-tuning an already pre-trained BERT rather than using the pre-trained BERT right away, but how do we do that when we don't have any sentences? I.e. how would the "next sentence prediction" part be handled? Or can we skip it?
Upvotes: 0
Views: 759
Reputation: 772
Your problem is a sequence classification problem. If you want to use a pre-trained model, you want to do transfer learning: take the BERT base model and add a classification layer on top, then fine-tune on your labeled transactions.
You can check huggingface for that https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForSequenceClassification
My answer isn't very specific; feel free to add details to your question.
Upvotes: 1