Reputation: 11
The current BERT base uncased clinical NER model predicts the clinical entities Problem, Test, and Treatment.
I want to train it on a different clinical dataset to get entities such as Disease, Medicine, and Problem.
How can I achieve that?
Upvotes: 1
Views: 491
Reputation: 396
There are several models on Hugging Face that are pretrained on medical articles; those should generally perform better on clinical text than the general-purpose bert-base-uncased. BioELECTRA is one of them, and it managed to outperform existing biomedical NLP models in several benchmark tests.
There are 3 different versions of this model, depending on the pretraining dataset, but I think these 2 are the best to start with:
Bioelectra-base-discriminator-pubmed: pretrained on PubMed
Bioelectra-base-discriminator-pubmed-pmc: pretrained on PubMed and PMC
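To get a different entity set out of any of these checkpoints, the main change is a new label list and a freshly initialised classification head. Here is a minimal sketch, assuming BIO tagging and the three entity types from the question. The helper names (`build_bio_labels`, `align_labels`, `load_model`) are made up for illustration, and the checkpoint id you pass in is something you would need to look up on the Hub yourself.

```python
ENTITY_TYPES = ["Disease", "Medicine", "Problem"]  # target entities from the question


def build_bio_labels(entity_types):
    """Return the BIO tag list plus both lookup tables."""
    labels = ["O"]
    for ent in entity_types:
        labels += [f"B-{ent}", f"I-{ent}"]
    label2id = {label: i for i, label in enumerate(labels)}
    id2label = {i: label for label, i in label2id.items()}
    return labels, label2id, id2label


def align_labels(word_label_ids, word_ids, ignore_index=-100):
    """Spread word-level label ids over subword tokens.

    word_ids has the shape returned by a fast tokenizer's word_ids():
    None for special tokens, otherwise the index of the source word.
    Only the first subword of each word keeps its label; the rest get
    ignore_index so the loss skips them.
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None or wid == previous:
            aligned.append(ignore_index)
        else:
            aligned.append(word_label_ids[wid])
        previous = wid
    return aligned


def load_model(checkpoint):
    """Load a checkpoint with a new token-classification head (hypothetical helper)."""
    # Imported lazily so the label helpers above work without transformers installed.
    from transformers import AutoModelForTokenClassification

    labels, label2id, id2label = build_bio_labels(ENTITY_TYPES)
    return AutoModelForTokenClassification.from_pretrained(
        checkpoint,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
        # Needed when the checkpoint already has a head with a different label count.
        ignore_mismatched_sizes=True,
    )
```

With three entity types this gives 7 labels (`O` plus a `B-`/`I-` pair per type). Masking continuation subwords with -100 is one common convention; labelling every subword is also used.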
As for the NER dataset, there are several datasets you could use, or you could combine them into a composite dataset. Some of these are BC5-disease, NCBI-disease, and BC5CDR-disease from the BLUE benchmark.
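If you go the composite route, each corpus usually has its own tag names, so they need to be mapped onto one shared scheme before merging. A rough sketch of that step follows; the corpus names and source tags (`B-DIS`, `B-DRUG`, and so on) are invented for illustration, so check the real tag set of each dataset first.

```python
# Hypothetical per-corpus tag names mapped onto the shared target scheme.
TAG_MAPS = {
    "corpus_a": {"B-DIS": "B-Disease", "I-DIS": "I-Disease"},
    "corpus_b": {"B-DRUG": "B-Medicine", "I-DRUG": "I-Medicine"},
}


def remap_tags(tags, tag_map):
    """Translate one tag sequence; tags without a mapping fall back to 'O'."""
    return [tag_map.get(tag, "O") for tag in tags]


def merge_corpora(corpora, tag_maps):
    """Merge {corpus_name: [(tokens, tags), ...]} into one list with unified tags."""
    merged = []
    for name, sentences in corpora.items():
        for tokens, tags in sentences:
            merged.append((tokens, remap_tags(tags, tag_maps[name])))
    return merged
```

One thing to watch: if a corpus does not annotate an entity type at all, every mention of that type ends up as `O` there, and the model can learn to miss it. It is worth checking how much that affects your target entities.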
Let me know if you need any help with building the model or setting up the fine-tuning. Also, please use proper metrics to evaluate the models, and do share the metrics dashboard once training finishes.
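On metrics: for NER, entity-level precision, recall, and F1 (exact span and type match) are usually more informative than token accuracy, since most tokens are `O`. Below is a simplified pure-Python sketch of that idea, assuming BIO tags. In practice a library such as seqeval is commonly used instead, and its handling of malformed sequences differs from this sketch, which simply ignores an `I-` tag that has no preceding `B-`.

```python
def extract_entities(tags):
    """Return a set of (type, start, end) spans from a BIO sequence (end exclusive)."""
    spans = set()
    start, ent_type = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # trailing sentinel closes the last span
        inside = tag.startswith("I-") and ent_type == tag[2:]
        if not inside:
            if ent_type is not None:
                spans.add((ent_type, start, i))
                start, ent_type = None, None
            if tag.startswith("B-"):
                start, ent_type = i, tag[2:]
    return spans


def entity_prf(gold_seqs, pred_seqs):
    """Micro-averaged entity-level precision, recall, and F1 over all sentences."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        gold_spans, pred_spans = extract_entities(gold), extract_entities(pred)
        tp += len(gold_spans & pred_spans)
        fp += len(pred_spans - gold_spans)
        fn += len(gold_spans - pred_spans)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For example, if the gold tags contain one Disease span and one Medicine span and the model only finds the Disease span, precision is 1.0 and recall is 0.5.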
Upvotes: 3