codeKarma

Reputation: 117

How to convert an AutoModelForCausalLM object to a dspy model object?

    import dspy
    llm = dspy.HFModel(model='model')

This method takes a string as input for the model. If I have a quantized model object of class AutoModelForCausalLM, how can I convert the model object to a dspy object?

Direct assignment gives an error on inference:

    llm = model  # previously created as an AutoModelForCausalLM class object
    llm("Testing testing, is anyone out there?")

The inference call fails with:

    File /opt/conda/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py:623, in LlamaModel.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, use_cache, output_attentions, output_hidden_states, return_dict)
        621     raise ValueError("You cannot specify both decoder_input_ids and decoder_inputs_embeds at the same time")
        622 elif input_ids is not None:
    --> 623     batch_size, seq_length = input_ids.shape
        624 elif inputs_embeds is not None:
        625     batch_size, seq_length, _ = inputs_embeds.shape

    AttributeError: 'str' object has no attribute 'shape'
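The traceback makes the mismatch visible: dspy.HFModel handles tokenization internally, whereas a bare AutoModelForCausalLM's forward() immediately reads input_ids.shape, which a Python string does not have. A minimal sketch of the failing line (the forward stub below is hypothetical, standing in for LlamaModel.forward):

```python
# Hypothetical stand-in for line 623 of LlamaModel.forward: the model
# expects a token-id tensor with a .shape attribute, not raw text.
def forward(input_ids):
    batch_size, seq_length = input_ids.shape
    return batch_size, seq_length

try:
    forward("Testing testing, is anyone out there?")  # passing raw text
except AttributeError as exc:
    print(exc)  # 'str' object has no attribute 'shape'
```

This is why the string must go through a wrapper (or a tokenizer) before reaching the model.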

Upvotes: 0

Views: 107

Answers (1)

codeKarma

Reputation: 117

I think I had missed the documentation details.

    Init signature:
    dspy.HFModel(
        model: str,
        checkpoint: Optional[str] = None,
        is_client: bool = False,
        hf_device_map: Literal['auto', 'balanced', 'balanced_low_0', 'sequential'] = 'auto',
        token: Optional[str] = None,
        model_kwargs: Optional[dict] = {},
    )
    Docstring: Abstract class for language models.
    Init docstring:
    Args:
        model (str): HF model identifier to load and use
        checkpoint (str, optional): load specific checkpoints of the model. Defaults to None.
        is_client (bool, optional): whether to access models via client. Defaults to False.
        hf_device_map (str, optional): HF config strategy to load the model. Recommended to use "auto", which will help loading large models using accelerate. Defaults to "auto".
        model_kwargs (dict, optional): additional kwargs to pass to the model constructor. Defaults to empty dict.
    File: /opt/conda/lib/python3.11/site-packages/dsp/modules/hf.py
    Type: ABCMeta
    Subclasses: HFClientTGI, HFClientVLLM, Together, Anyscale, ChatModuleClient, HFClientSGLang

So after reading this, I made the following changes and the code works:

    import dspy
    import torch  # needed for torch.float16

    # bnb_config is the quantization config created earlier
    model_specific_param = {"torch_dtype": torch.float16, "quantization_config": bnb_config}
    model_name = '/tmp/models/llama2/7b'
    llm = dspy.HFModel(model=model_name, model_kwargs=model_specific_param)
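For completeness, here is a hedged sketch of how the pieces the answer leaves undefined might look. The BitsAndBytesConfig values are assumptions (the original bnb_config is not shown), and the code is configuration only: it needs a machine with the Llama 2 weights at that local path to actually run.

```python
import dspy
import torch
from transformers import BitsAndBytesConfig

# Assumed 4-bit quantization config; adjust to match the bnb_config
# actually created earlier in the notebook.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model_name = '/tmp/models/llama2/7b'  # local path from the answer
llm = dspy.HFModel(
    model=model_name,
    model_kwargs={"torch_dtype": torch.float16, "quantization_config": bnb_config},
)
# llm("Testing testing, is anyone out there?") now tokenizes internally
```

The key point is that HFModel loads the model itself from the identifier or path; the quantization settings travel through model_kwargs rather than a pre-built AutoModelForCausalLM object.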

Upvotes: 0
