mitra mirshafiee

Reputation: 503

Where is the HuggingFace model saved when loading a model on Colab?

I have this code for loading a generative model. I'm not sure how to see the model files (i.e., config.json etc.) in Colab.

import transformers

model_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    #model_kwargs={"torch_dtype": torch.bfloat16, "cache_dir": cache_dir},
    device_map="auto",
)

Upvotes: 1

Views: 29

Answers (1)

cronoik

Reputation: 19475

You can locate everything that was downloaded for the model via the HF_HOME constant:

import os
from huggingface_hub.constants import HF_HOME 

print(HF_HOME)
print(*os.listdir(f"{HF_HOME}/hub"), sep="\n")

Output:

/root/.cache/huggingface
models--EleutherAI--gpt-neo-1.3B
.locks
models--gpt2
version.txt

Every directory prefixed with models-- contains the files of the respective model, but the actual file names are hashed, so this is probably not what you are looking for. Another option is to save the model locally again:

import os
from transformers import pipeline

local_path = "/content/my_model"
gen = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")
gen.save_pretrained(local_path)
print(*os.listdir(local_path), sep="\n")

Output:

model-00002-of-00002.safetensors
vocab.json
merges.txt
tokenizer.json
config.json
model.safetensors.index.json
special_tokens_map.json
tokenizer_config.json
generation_config.json
model-00001-of-00002.safetensors
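If you want to avoid writing a second copy of the weights, a lighter-weight sketch (assuming huggingface_hub is installed, which it is as a transformers dependency) uses snapshot_download: it resolves the already-downloaded snapshot inside the hub cache and returns a directory whose entries keep their original names, since they are symlinks into the hashed blobs:

```python
import os
from huggingface_hub import snapshot_download

# Returns the local snapshot directory for the repo; if the model is
# already in the cache (e.g. after running the pipeline above), nothing
# is downloaded again. The entries are symlinks with readable names.
path = snapshot_download("EleutherAI/gpt-neo-1.3B")
print(path)
print(*os.listdir(path), sep="\n")
```

The listing should show config.json, tokenizer files, and the safetensors shards directly, without the models--/blobs hashing in the way.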

In case you are only interested in the config, you can also access it directly via the respective property:

print(gen.model.config)

Upvotes: 0
