Reputation: 177
I have a pretrained spaCy model in a local folder that I can easily load with m = spacy.load("path/model/").
But now I have to upload it as a .tar.gz file to use as a SageMaker model artifact. How can I load the model from this .tar.gz file?
Ideally I want to read the unzipped folder from memory, without extracting everything to disk and then reading it again.
My question is almost a duplicate of "Directly load spacy model from packaged tar.gz file", but the answers there don't explain how to untar and unzip the folder into memory.
Upvotes: 1
Views: 670
Reputation: 177
It turns out SageMaker already decompresses the .tar.gz file automatically.
So I can just load the folder with spacy.load exactly like before.
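For anyone who needs the same thing outside SageMaker (for example, to check the artifact locally), here is a minimal sketch that unpacks the archive into a temporary directory and loads from there. It does touch disk, so it is not the in-memory approach asked about. The function name load_from_targz and the loader and subdir parameters are made up for illustration, and it assumes the model files sit at the archive root unless subdir says otherwise.

```python
import tarfile
import tempfile
from pathlib import Path


def load_from_targz(archive_path, loader=None, subdir=""):
    """Extract a .tar.gz into a temporary directory and load a model from it.

    `loader` is a hypothetical hook: any callable that takes a folder path.
    When omitted, spacy.load is used, which requires spaCy to be installed.
    `subdir` is the folder inside the archive that holds the model files
    (assumed to be the archive root by default).
    """
    if loader is None:
        import spacy  # imported lazily so the helper itself only needs the stdlib
        loader = spacy.load

    with tempfile.TemporaryDirectory() as tmp_dir:
        with tarfile.open(archive_path, mode="r:gz") as archive:
            # Only extract archives you trust: members can carry unsafe paths.
            archive.extractall(tmp_dir)
        # The loader must finish reading before the temporary folder is removed.
        return loader(Path(tmp_dir) / subdir)
```

Usage would be something like m = load_from_targz("model.tar.gz"). The temporary folder is deleted as soon as the function returns, so this only works with loaders that read everything up front.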
Upvotes: 0
Reputation: 15593
Take a look at spaCy's serialization docs. Reading the unzipped folder from memory isn't really practical (I'm not sure how that would even work), but you can use simple in-memory serialization instead. In that case you save the config and the model data separately.
To save:
config = nlp.config
bytes_data = nlp.to_bytes()
To read back:
import spacy
# Rebuild an empty pipeline from the saved config, then restore its data.
lang_cls = spacy.util.get_lang_class(config["nlp"]["lang"])
nlp = lang_cls.from_config(config)
nlp.from_bytes(bytes_data)
Upvotes: 0