Reputation: 61
I'm trying to get the vocabulary from some publicly available pre-trained models (that aren't mine) through the Python interface of AllenNLP, using self.vocab. However, I'm running into problems loading the model. I want to get the vocabulary from the dygiepp models, using the following code:
from allennlp.models.model import Model
scierc_model = Model.from_archive('https://s3-us-west-2.amazonaws.com/ai2-s2-research/dygiepp/master/scierc.tar.gz')
However, I get the following error:
---------------------------------------------------------------------------
ConfigurationError Traceback (most recent call last)
/tmp/local/63381207/ipykernel_7616/3549263982.py in <module>
----> 1 scierc_model = Model.from_archive('https://s3-us-west-2.amazonaws.com/ai2-s2-research/dygiepp/master/scierc.tar.gz')
~/anaconda3/envs/dygiepp/lib/python3.7/site-packages/allennlp/models/model.py in from_archive(cls, archive_file, vocab)
480 from allennlp.models.archival import load_archive # here to avoid circular imports
481
--> 482 model = load_archive(archive_file).model
483 if vocab:
484 model.vocab.extend_from_vocab(vocab)
~/anaconda3/envs/dygiepp/lib/python3.7/site-packages/allennlp/models/archival.py in load_archive(archive_file, cuda_device, overrides, weights_file)
231 # Instantiate model and dataset readers. Use a duplicate of the config, as it will get consumed.
232 dataset_reader, validation_dataset_reader = _load_dataset_readers(
--> 233 config.duplicate(), serialization_dir
234 )
235 model = _load_model(config.duplicate(), weights_path, serialization_dir, cuda_device)
~/anaconda3/envs/dygiepp/lib/python3.7/site-packages/allennlp/models/archival.py in _load_dataset_readers(config, serialization_dir)
267
268 dataset_reader = DatasetReader.from_params(
--> 269 dataset_reader_params, serialization_dir=serialization_dir
270 )
271 validation_dataset_reader = DatasetReader.from_params(
~/anaconda3/envs/dygiepp/lib/python3.7/site-packages/allennlp/common/from_params.py in from_params(cls, params, constructor_to_call, constructor_to_inspect, **extras)
586 "type",
587 choices=as_registrable.list_available(),
--> 588 default_to_first_choice=default_to_first_choice,
589 )
590 subclass, constructor_name = as_registrable.resolve_class_name(choice)
~/anaconda3/envs/dygiepp/lib/python3.7/site-packages/allennlp/common/params.py in pop_choice(self, key, choices, default_to_first_choice, allow_class_names)
322 """{"model": "my_module.models.MyModel"} to have it imported automatically."""
323 )
--> 324 raise ConfigurationError(message)
325 return value
326
ConfigurationError: dygie not in acceptable choices for dataset_reader.type: ['babi', 'conll2003', 'interleaving', 'multitask', 'multitask_shim', 'sequence_tagging', 'sharded', 'text_classification_json']. You should either use the --include-package flag to make sure the correct module is loaded, or use a fully qualified class name in your config file like {"model": "my_module.models.MyModel"} to have it imported automatically.
The error message explains how to fix this from the command line, but not from the Python interface. I also tried adding the line import dygie to my code to import the missing package, but that didn't solve the problem.
Does anyone know how to get around this?
Upvotes: 0
Views: 133
Reputation: 2627
To run this model, you'll need to have the code from this repo: https://github.com/dwadden/dygiepp.
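In particular, you need to import the DyGIE dataset reader from here: https://github.com/dwadden/dygiepp/blob/master/dygie/data/dataset_readers/dygie.py#L29. Importing the module that defines the reader is what registers it under the name dygie; importing only the top-level dygie package may not load that submodule, which would explain why import dygie alone didn't help.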
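As a rough sketch of the Python-side equivalent of --include-package: as far as I know, that flag calls import_module_and_submodules from allennlp.common.util, which imports a package and all of its submodules so that every registered class becomes visible. The helper name load_vocab_from_archive below is made up for illustration. The sketch assumes the dygiepp repo is cloned and importable (for example, you run from the repo root) and that your installed AllenNLP version is compatible with the archive.

```python
def load_vocab_from_archive(archive_path, package="dygie"):
    """Import `package` recursively, then load the archive and return its vocabulary.

    Hypothetical helper, not part of AllenNLP or dygiepp.
    """
    # Imports live inside the function so the helper can be defined
    # even in an environment where AllenNLP is not installed.
    from allennlp.common.util import import_module_and_submodules
    from allennlp.models.archival import load_archive

    # Assumption: `package` is importable (the dygiepp repo is on sys.path).
    # Importing every submodule runs the register decorators, so the
    # "dygie" dataset reader and model become known to AllenNLP.
    import_module_and_submodules(package)

    archive = load_archive(archive_path)
    return archive.model.vocab


# Usage, with the archive URL from the question:
# vocab = load_vocab_from_archive(
#     "https://s3-us-west-2.amazonaws.com/ai2-s2-research/dygiepp/master/scierc.tar.gz"
# )
# print(vocab)
```

Importing the reader module directly (dygie.data.dataset_readers.dygie) should also register the dataset reader, but the model class would still need to be imported separately, so the recursive import is probably the simpler route.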
Upvotes: 0