I'm using Ray's ray.remote to define an InferenceActor class for handling model inference tasks; it has a run_inference method that takes a single parameter (a list of strings). However, when I call run_inference for the first time, I encounter the following error:
Could not serialize the argument b'__RAY_DUMMY__' for a task or actor services.inference_actor.InferenceActor.run_inference
InferenceActor class:
import ray
from vllm import LLM, SamplingParams

ray.init(num_gpus=1)


@ray.remote(num_gpus=1)
class InferenceActor:
    def __init__(self, settings: AppSettings):
        # Load the vLLM model onto the GPU reserved for this actor
        self.model = LLM(
            model=settings.llm_settings.model_path,
            tokenizer=settings.llm_settings.tokenizer_path,
            gpu_memory_utilization=settings.llm_settings.gpu_mem_limit,
        )
        # Sampling configuration shared by all inference calls
        self.sampling_parameters = SamplingParams(
            top_p=settings.extraction_settings.top_p,
            temperature=settings.extraction_settings.temperature,
            max_tokens=settings.extraction_settings.max_new_tokens,
            stop=settings.extraction_settings.stop_sequence,
            include_stop_str_in_output=True,
        )

    def run_inference(self, prompts: list[str]):
        # Generate a completion for each prompt and return the top candidate's text
        results = self.model.generate(prompts, self.sampling_parameters)
        outputs = [result.outputs[0].text for result in results]
        return outputs
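For context, this is roughly how I create the actor and invoke the method (a sketch; the prompt strings are placeholders):

settings = AppSettings()  # my application config object

# Instantiate the actor and call the remote method, then block on the result
actor = InferenceActor.remote(settings)
future = actor.run_inference.remote(["prompt one", "prompt two"])
outputs = ray.get(future)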
It seems to be related to serialization, but I'm not sure what's causing the issue or how to resolve it. Has anyone run into this problem before, or does anyone have suggestions on what might be going wrong?
I have tried serialising the prompts argument with several different serialisation libraries, without success.
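For example, a representative sanity check (pickle and json shown here as stand-ins for the libraries I tried):

import json
import pickle

# The prompts list round-trips cleanly through standard serialisers,
# so the plain data itself doesn't appear to be the problem
prompts = ["prompt one", "prompt two"]
assert pickle.loads(pickle.dumps(prompts)) == prompts
assert json.loads(json.dumps(prompts)) == prompts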
Any insights would be greatly appreciated!
Thanks!