Reputation: 1137
First of all, sorry for any newbie mistakes I've made. I couldn't figure this out, and I couldn't find a source specific to the deeppavlov (NER) library. I'm trying to train ner_ontonotes_bert_mult as described here. I guess it can be trained from its checkpoint so that it recognizes specific patterns like:
"Round 23/22; 24,9 x 12,2 x 12,3"
as
[[['Round', '23/22', ';', '24,9 x 12,2 x 12,3']], [['B-PRODUCT', 'I-PRODUCT', 'B-QUANTITY']]]
My questions are (before I dig into details):
I don't even know if this is possible, but I decided to give it a go and prepared three .txt files, "train.txt", "test.txt" and "validation.txt", as described on the DeepPavlov web page. I put them in the folder '~/.deeppavlov/downloads/ontonotes/ner_ontonotes_bert_mult'. My dataset looks like this:
Round B-PRODUCT
23/22 I-PRODUCT
24,9 x 12,2 x 12,3 B-QUANTITY
Ring B-PRODUCT
HDFAA I-PRODUCT
12,7 x 10 B-QUANTITY
and so on... This is the code I use to train it:
import os
# Force tensorflow to use CPU instead of GPU.
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
from deeppavlov import configs, train_model
from deeppavlov.core.commands.utils import parse_config
config_dict = parse_config(configs.ner.ner_ontonotes_bert_mult)
print(config_dict['dataset_reader']['data_path'])
ner_model = train_model(configs.ner.ner_ontonotes_bert_mult)
But I am getting this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [3] rhs shape= [37]
[[{{node save/Assign_280}}]]
Full traceback:
2019-09-26 15:50:27.63 ERROR in 'deeppavlov.core.common.params'['params'] at line 110: Exception in <class 'deeppavlov.models.bert.bert_ner.BertNerModel'>
Traceback (most recent call last):
File "/home/custom_user/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/home/custom_user/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/custom_user/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [3] rhs shape= [37]
[[{{node save/Assign_280}}]]
And I realized I can't use samples like " Round 23/22; 24,9 x 12,2 x 12,3 ". I need them to be in full sentences.
It seems this is happening because of my dataset. My custom dataset has only 3 tags (B-PRODUCT, I-PRODUCT and B-QUANTITY), but the pre-trained model has 37. All available tags are listed here, under the sentence "The list of available tags and their descriptions are presented below." There are 18 main tags (36 with the B and I prefixes) plus the O tag (“O” means the absence of an entity). All 37 tags need to be present in the dataset. I got past that error by adding dummy sentences tagged with the missing tags. This is a terrible workaround, since I'm deliberately corrupting my own dataset. I'm still looking for a 'logical' way to train...
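A quick way to confirm the mismatch is to count the distinct tags in the training data. This is only a minimal sketch, assuming one token and its tag per line with the tag as the last whitespace-separated field; collect_tags is a hypothetical helper of my own, not part of DeepPavlov.

```python
def collect_tags(lines):
    """Collect the distinct tags from 'token tag' lines (tag = last field)."""
    tags = set()
    for line in lines:
        fields = line.split()
        # Skip blank lines (sentence separators) and document markers.
        if not fields or fields[0] == "-DOCSTART-":
            continue
        tags.add(fields[-1])
    return tags


# The sample lines from my dataset, with a blank line between the two items.
sample = [
    "Round B-PRODUCT",
    "23/22 I-PRODUCT",
    "24,9 x 12,2 x 12,3 B-QUANTITY",
    "",
    "Ring B-PRODUCT",
    "HDFAA I-PRODUCT",
    "12,7 x 10 B-QUANTITY",
]
found = collect_tags(sample)
print(len(found), sorted(found))
```

For a real file, something like `collect_tags(open('train.txt', encoding='utf8'))` should work the same way. The sample yields 3 distinct tags, which matches the lhs shape= [3] in the error, while the checkpoint expects 37.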
PS: Now I am getting this error.
Traceback (most recent call last):
File "/home/custom_user/.PyCharm2019.2/config/scratches/scratch_9.py", line 13, in <module>
ner_model = train_model(configs.ner.ner_ontonotes_bert_mult)
File "/home/custom_user/.local/lib/python3.6/site-packages/deeppavlov/__init__.py", line 31, in train_model
train_evaluate_model_from_config(config, download=download, recursive=recursive)
File "/home/custom_user/.local/lib/python3.6/site-packages/deeppavlov/core/commands/train.py", line 121, in train_evaluate_model_from_config
trainer.train(iterator)
File "/home/custom_user/.local/lib/python3.6/site-packages/deeppavlov/core/trainers/nn_trainer.py", line 294, in train
self.train_on_batches(iterator)
File "/home/custom_user/.local/lib/python3.6/site-packages/deeppavlov/core/trainers/nn_trainer.py", line 234, in train_on_batches
self._validate(iterator)
File "/home/custom_user/.local/lib/python3.6/site-packages/deeppavlov/core/trainers/nn_trainer.py", line 150, in _validate
metrics = list(report['metrics'].items())
AttributeError: 'NoneType' object has no attribute 'items'
Upvotes: 3
Views: 2575
Reputation: 151
I tried DeepPavlov training and successfully trained the 'ner' model.
I also got the same error at first while training, then overcame it by researching more about it.
Things to know before training:
-> You can find a link to the 'ner_ontonotes_bert_mult.json' config file in the DeepPavlov docs; it specifies the dataset path, the pretrained model path, the dataset_reader and the chain pipe used for training.
-> There is a pretrained model in the directory mentioned in the config. By default the root directory is 'C:/users/{user_name}/.deeppavlov/', and pretrained models are stored in its 'models' subdirectory.
-> When you start training, the already trained model gets modified, which means training just tries to improve the pretrained model.
So to train and build your own model (from scratch), simply delete the 'models' subdirectory from the '.deeppavlov' path and run the training.
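If you would rather keep the pretrained files than delete them, you could rename the directory instead. A rough sketch, assuming the layout described above (a 'models' subdirectory under the .deeppavlov root); set_aside_models and the 'models_backup' name are my own hypothetical choices, not anything DeepPavlov provides. The demo runs on a throwaway temp directory so it does not touch a real installation.

```python
import shutil
import tempfile
from pathlib import Path


def set_aside_models(root):
    """Rename <root>/models to <root>/models_backup; return the new path."""
    models = Path(root).expanduser() / "models"
    backup = models.with_name("models_backup")
    if not models.is_dir():
        return None  # nothing to set aside
    if backup.exists():
        raise FileExistsError(f"{backup} already exists")
    shutil.move(str(models), str(backup))
    return backup


# Demo on a temporary directory standing in for the .deeppavlov root.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "models").mkdir()
backup = set_aside_models(demo_root)
print(backup.name)
```

For real use you would pass your own .deeppavlov root to set_aside_models before running the training.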
Upvotes: 1
Reputation: 66
There are at least two problems here:
1. instead of validation.txt there should be a valid.txt file;
2. you are trying to retrain a model that was pretrained on a different dataset with a different set of tags, which is not necessary.
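For the first point, a quick check of the data directory can catch a misnamed file before training starts. This is just a sketch: missing_split_files is a hypothetical helper, and the three expected names are the ones mentioned in this answer. The demo uses a temporary directory rather than your real data.

```python
import tempfile
from pathlib import Path

EXPECTED_FILES = ("train.txt", "valid.txt", "test.txt")


def missing_split_files(data_dir):
    """Return the expected split files that are absent from data_dir."""
    data_dir = Path(data_dir).expanduser()
    return [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]


# Demo: reproduce the naming from the question in a temporary directory.
demo_dir = Path(tempfile.mkdtemp())
for name in ("train.txt", "validation.txt", "test.txt"):
    (demo_dir / name).write_text("", encoding="utf8")
print(missing_split_files(demo_dir))  # ['valid.txt']

# Renaming validation.txt fixes it.
(demo_dir / "validation.txt").rename(demo_dir / "valid.txt")
print(missing_split_files(demo_dir))  # []
```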
To train your model from scratch you can do something like:
import json
from deeppavlov import configs, build_model, train_model
with configs.ner.ner_ontonotes_bert_mult.open(encoding='utf8') as f:
    ner_config = json.load(f)
ner_config['dataset_reader']['data_path'] = '~/my_data_dir/' # directory with train.txt, valid.txt and test.txt files
ner_config['metadata']['variables']['NER_PATH'] = '~/where_to_save_the_model/'
ner_config['metadata']['download'] = [ner_config['metadata']['download'][-1]] # do not download the pretrained ontonotes model
ner_model = train_model(ner_config, download=True)
Another thing that could go wrong is tokenization: "Round 23/22; 24,9 x 12,2 x 12,3" will be split by the model into ['Round', '23', '/', '22', ';', '24', ',', '9', 'x', '12', ',', '2', 'x', '12', ',', '3'] and not ['Round', '23/22', ';', '24,9 x 12,2 x 12,3'].
But you can tokenize your texts beforehand:
ner_model([['Round', '23/22', ';', '24,9 x 12,2 x 12,3']])
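One possible way to produce such token lists is a small regex pre-tokenizer. This is only a sketch under assumptions: the patterns are guessed from the two examples in the question (sizes like 23/22 and dimensions like 24,9 x 12,2 x 12,3), and pre_tokenize is a hypothetical helper, not a DeepPavlov function.

```python
import re

TOKEN_RE = re.compile(
    r"\d+(?:,\d+)?(?:\s+x\s+\d+(?:,\d+)?)+"  # dimensions such as 24,9 x 12,2 x 12,3
    r"|\d+/\d+"                               # sizes such as 23/22
    r"|\w+"                                   # words and plain numbers
    r"|[^\w\s]"                               # any single punctuation mark
)


def pre_tokenize(text):
    """Split text into tokens, keeping sizes and dimensions in one piece."""
    return TOKEN_RE.findall(text)


print(pre_tokenize("Round 23/22; 24,9 x 12,2 x 12,3"))
# ['Round', '23/22', ';', '24,9 x 12,2 x 12,3']
```

The result could then be passed as ner_model([pre_tokenize(text)]). Keep in mind that a token containing spaces may be ambiguous in a training file where the token and the tag are separated by whitespace, so it is worth checking how your dataset files are read.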
Upvotes: 4